Disclosure of Invention
The application aims to provide a bare metal or container-based computing resource scheduling method, device and system. The real-time running condition of a target computing power resource group is predicted through a pre-trained safety water line prediction model to determine a current inventory safety water line; deployment is then performed based on the water line and a monitoring task is started; when a user request is monitored, whether an instance needs to be released is judged according to real-time inventory information and a threshold corresponding to the preset safety water line, so as to schedule resources reasonably, ensure the stability of critical tasks, and improve service quality and user experience.
In a first aspect, the application provides a bare metal or container-based computing resource scheduling method, which comprises the steps of: acquiring first computing power resource data, wherein the first computing power resource data comprises real-time resource demand data, real-time resource supply data and first priority label data of a target computing power resource group; predicting the first computing power resource data through a pre-trained safety water line prediction model to obtain an inventory safety water line corresponding to the current period, wherein the inventory safety water line is used for representing the minimum value of the instance resource demand of the high-priority target computing power resource group; performing system initialization deployment according to the inventory safety water line, and starting a monitoring task to monitor the use data of target computing power resources and system resources in real time and convert the use data into inventory information, wherein the inventory information at least comprises the total group number and the idle group number corresponding to the target computing power resource group; when a newly added user resource creation request is monitored, parsing the user resource creation request to obtain a target priority label; if the target priority label is a high priority label and the ratio of the idle group number to the total group number, judged based on the inventory information, is not greater than the threshold corresponding to the preset safety water line, traversing the priority labels corresponding to each running target computing power resource group, identifying the target computing power resource groups corresponding to low priority labels and performing an instance release operation; and responding to the user resource creation request based on the resources after the instance release operation.
Further, the safety water line prediction model is obtained by the following steps: acquiring second computing power resource data, wherein the second computing power resource data comprises historical resource demand data of the target computing power resource group and second priority label data; cleaning, filtering and formatting the second computing power resource data to obtain a labeling data set; determining a characteristic value sequence corresponding to each computing power resource and a plurality of multi-element time sequences corresponding to the computing power resources based on the labeling data set, wherein the characteristic value sequence comprises a demand average value corresponding to the computing power resource and based on a time axis; and, based on a machine learning model, training with the characteristic value sequences corresponding to all computing power resources as model input values and the multi-element time sequences as model output values to obtain the safety water line prediction model.
Further, the step of determining the characteristic value sequences and the multi-element time sequences comprises: extracting key characteristics from the labeling data of the high priority labels in the labeling data set to obtain the characteristic value sequence corresponding to each computing power resource; processing the characteristic value sequence corresponding to each computing power resource through an autoregressive model to obtain a prediction sequence corresponding to each computing power resource, wherein the prediction sequence comprises a sequence reflecting, based on the time axis, long-term trend characteristics, periodic characteristics and load mode characteristics; and integrating the prediction sequences respectively corresponding to all computing power resources at the same time point on the time axis to obtain the plurality of multi-element time sequences corresponding to the computing power resources.
Further, the step of identifying the target computing power resource groups corresponding to low priority labels and performing the instance release operation comprises: obtaining the priority label data corresponding to all running target computing power resource groups, and forming all target computing power resource groups corresponding to low priority labels into a to-be-processed set; for each target computing power resource group in the to-be-processed set, obtaining current resource running data and a preset weight value corresponding to the target computing power resource group, calculating a resource weighted value corresponding to the target computing power resource group based on the current resource running data and the preset weight value, and judging whether the resource weighted value is smaller than a first threshold value, wherein the first threshold value is the minimum weighted value required for the target computing power resource group to reach an effective resource utilization rate; if so, determining the instance corresponding to the target computing power resource group as a to-be-released instance; arranging all to-be-released instances based on their preset weight values to obtain a target release instance sequence; releasing instance resources based on the target release instance sequence and a preset number; and updating the inventory information based on the released resources.
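The release-selection flow above can be sketched as follows. All names, record fields and values here are illustrative assumptions rather than the application's actual implementation, and the resource weighted value of each group is taken as precomputed:

```python
# Sketch of the low-priority instance release flow: filter low-priority
# groups, keep those whose resource weighted value is below the first
# threshold, order them by preset weight value, and release a preset number.

def select_release_sequence(groups, first_threshold, release_count):
    """Return the ids of instances to release, in release order."""
    # Form the to-be-processed set: running groups with a low priority label.
    pending = [g for g in groups if g["priority"] == "low"]
    # Keep groups below the minimum effective-utilization weight.
    to_release = [g for g in pending if g["resource_weight"] < first_threshold]
    # Arrange the candidates by preset weight value.
    to_release.sort(key=lambda g: g["preset_weight"])
    # Release only the preset number of instances.
    return [g["id"] for g in to_release[:release_count]]

groups = [
    {"id": "g1", "priority": "low",  "resource_weight": 0.2, "preset_weight": 0.5},
    {"id": "g2", "priority": "high", "resource_weight": 0.1, "preset_weight": 0.9},
    {"id": "g3", "priority": "low",  "resource_weight": 0.8, "preset_weight": 0.3},
    {"id": "g4", "priority": "low",  "resource_weight": 0.3, "preset_weight": 0.2},
]
```

After selection, the inventory information would be updated with the resources of the released instances.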
Further, the step of calculating the resource weighted value corresponding to the target computing power resource group based on the current resource running data and the preset weight value comprises calculating the resource weighted value corresponding to the target computing power resource group according to the following formula:
A = k × Δt × Σ_{i=1}^{m} (w_i × x_i);

wherein A represents the resource weighted value corresponding to the target computing power resource group; Δt is the single continuous use duration of the target computing power resource group; k is the idle influence evaluation value corresponding to the application scene in the current period; w_i is the weight corresponding to the i-th computing power resource operation index in the current period; x_i is the actual operation data corresponding to the i-th computing power resource operation index; and m is the total number of computing power resource operation indexes.
Further, after the step of updating the inventory information based on the released resources, the method further comprises: judging whether the updated inventory information is lower than or equal to the threshold corresponding to the preset safety water line; if so, determining a resource index to be adjusted based on the updated inventory information, wherein the resource index to be adjusted is an index with insufficient current resources determined by comparing the updated inventory information with the inventory safety water line, and calculating, based on the preset weight value corresponding to the resource index to be adjusted, an updated preset weight value corresponding to each to-be-released instance; and if not, continuing to respond to the user resource creation request based on the resources after the instance release operation.
Further, the step of responding to the user resource creation request based on the resources after the instance release operation comprises: determining configuration parameters for scheduling resources on a target node based on the resource demand parameters carried in the user resource creation request and a dynamic resource scheduling algorithm that maximizes resource-weighted utilization; calling a target computing power resource group operation interface based on the configuration parameters to create a target computing power resource group instance; returning details and access information of the target computing power resource group instance to the user; and updating the inventory information.
In a second aspect, the application also provides a bare metal or container-based computing resource scheduling device, where the device comprises a plurality of modules for executing the steps of the bare metal or container-based computing resource scheduling method according to the first aspect, the plurality of modules comprising a data acquisition module, a water line prediction module, a task monitoring module, a request parsing module, an instance release module, and a request response module. The data acquisition module is configured to acquire first computing power resource data, wherein the first computing power resource data comprises real-time resource demand data, real-time resource supply data and first priority label data of a target computing power resource group. The water line prediction module is configured to predict the first computing power resource data through a pre-trained safety water line prediction model to obtain an inventory safety water line corresponding to the current period, wherein the inventory safety water line is used for representing the minimum value of the instance resource demand of the high-priority target computing power resource group. The task monitoring module is configured to perform system initialization deployment according to the inventory safety water line and start a monitoring task to monitor the use data of the target computing power resources and system resources in real time and convert the use data into inventory information, wherein the inventory information at least comprises the total group number and the idle group number corresponding to the target computing power resource group. The request parsing module is configured to parse the user resource creation request to obtain a target priority label when a newly added user resource creation request is monitored. The instance release module is configured to, if the target priority label is a high priority label and the ratio of the idle group number to the total group number judged based on the inventory information is not greater than the threshold corresponding to the preset safety water line, traverse the priority labels corresponding to each running target computing power resource group, identify the target computing power resource groups corresponding to low priority labels, and perform the instance release operation. The request response module is configured to respond to the user resource creation request based on the resources after the instance release operation.
In a third aspect, the present application also provides a bare metal or container based computing resource scheduling system comprising a processor and a memory storing computer executable instructions executable by the processor, the processor executing the computer executable instructions to implement the method of the first aspect.
In a fourth aspect, the present application also provides a computer readable storage medium storing computer executable instructions which, when invoked and executed by a processor, cause the processor to implement the method of the first aspect.
According to the bare metal or container-based computing resource scheduling method, device and system provided by the application, the real-time running condition of the target computing power resource group is predicted through the pre-trained water line prediction model and the current inventory safety water line is determined. Then, based on the dynamic inventory safety water line and the actual running condition of the target computing power resource group instances in the current system, whether to trigger the mechanism for automatically releasing low-priority instances is judged, so that flexible scheduling of target computing power resource group instance resources is realized, sufficient resources are reserved for critical tasks, the stability of critical tasks is ensured, and the user experience is improved.
Detailed Description
The technical solutions of the present application will be clearly and completely described in connection with the embodiments, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Aiming at the technical problems that in the prior art, computing resources are not reasonably used and the stability of critical tasks is poor, the embodiment of the application provides a computing resource scheduling method, device and system based on bare metal or a container, which are used for predicting the real-time running condition of a target computing resource group through a pre-trained water line prediction model, determining a current inventory safety water line, then deploying based on the water line, starting a monitoring task, and judging whether an instance needs to be released according to real-time inventory information and a corresponding threshold value of the safety water line when a user request is monitored so as to perform reasonable resource scheduling, ensure the stability of the critical task and improve the service quality and the user experience.
For the convenience of understanding the present embodiment, a method for scheduling computing resources based on bare metal or container disclosed in the present embodiment will be described in detail.
Fig. 1 is a flowchart of a bare metal or container-based computing resource scheduling method according to an embodiment of the present application, the method including the steps of:
step S102, acquiring first computing power resource data, wherein the first computing power resource data comprises real-time resource demand data, real-time resource supply data and first priority label data of a target computing power resource group;
In the embodiment of the application, the target computing power resource group comprises container instance resources or bare metal server instance resources, and the real-time resource demand data of the target computing power resource group comprises the current use condition of the target computing power resource group, specifically comprises the current instance number, and the real-time data of indexes such as CPU (central processing unit) use rate, memory use amount, disk IO (input output), network flow and the like of each instance or the whole. These real-time data are recorded in time series and reflect the current situation of resource usage.
The real-time resource supply data of the target computing power resource group comprises the resource quantity of the target computing power resource group that the cloud platform can provide at the current moment, including the container instance CPU amount, the container instance memory amount, the container instance disk capacity, the number of container instance GPU cards, and the available number of bare metal instances of each specification. For example, the cloud platform may currently be able to create container instances with 16 vCPUs, 64 GB memory, a 100 GB disk and 2 GPU cards, and 2 bare metal instances of the PSTD001 standard specification.
The target computing power resource group corresponds to first priority label data, wherein the first priority label data comprises a priority label for each resource in the resource group. The priority labels can be grade labels created for the target computing power resource group based on different business scenes or user service priority degrees. In the case of limited resources, a high priority resource group may be given priority in resource allocation to ensure smooth execution of critical tasks. The priority label may also be determined by the cloud platform based on the user class and the resource traffic type. For example, for a large important customer, the resources created under its project can be automatically labeled with high priority; some users can label resources as key business, and the platform classifies these labels.
Step S104, predicting the first computing power resource data through a pre-trained safety water line prediction model to obtain an inventory safety water line corresponding to the current period, wherein the inventory safety water line is used for representing the minimum value of the instance resource demand of the high-priority target computing power resource group;
The safe water line prediction model is obtained by performing data preprocessing and feature extraction on the basis of second computing power resource data (comprising historical resource demand data of a target computing power resource group and second priority label data) to obtain a feature value sequence of each computing power resource and multiple time sequences corresponding to a plurality of computing power resources, and performing model training on the basis of the feature value sequence of each computing power resource and multiple time sequences corresponding to the plurality of computing power resources.
Step S106, performing system initialization deployment according to the inventory safety water line, starting a monitoring task to monitor the use data of the target computing power resources and system resources in real time, and converting the use data into inventory information, wherein the inventory information at least comprises the total group number and the idle group number corresponding to the target computing power resource group;
The inventory information may further include an allocated group number corresponding to the target computing power resource group, a priority distribution corresponding to the allocated group, and user resource demand group data, where the total group number of the target computing power resource group=the allocated group number+the free group number. The priority distribution corresponding to the allocated groups comprises priority labels corresponding to each allocated target computing power resource group, and the priority labels can be divided into two types according to the priority, wherein one type is an allocated target computing power resource group with high priority, and the other type is an allocated target computing power resource group with low priority.
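The inventory identity above (total group number = allocated group number + idle group number, with the allocated groups split by priority) can be illustrated with a minimal record; the field names are assumptions for illustration only:

```python
# Illustrative inventory record for a target computing power resource group,
# reflecting: total groups = allocated groups + idle (free) groups.
from dataclasses import dataclass

@dataclass
class Inventory:
    allocated_high: int  # allocated groups carrying high priority labels
    allocated_low: int   # allocated groups carrying low priority labels
    free: int            # idle group number

    @property
    def allocated(self) -> int:
        # Priority distribution of the allocated groups sums to the
        # allocated group number.
        return self.allocated_high + self.allocated_low

    @property
    def total(self) -> int:
        return self.allocated + self.free

inv = Inventory(allocated_high=6, allocated_low=4, free=10)
```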
In the embodiment of the application, the use data of each container or bare metal instance resource and system resource is monitored in real time based on the monitoring task, and is converted and updated into the inventory information, so that the subsequent opportunity for deciding to release the low-priority target computing power resource group according to the inventory information is facilitated, and the requirement of the high-priority target computing power resource group is ensured.
Step S108, when a newly added user resource creation request is monitored, analyzing the user resource creation request to obtain a target priority label;
Step S110, if the target priority label is a high priority label and the ratio of the idle group number to the total group number, judged based on the inventory information, is not greater than the threshold corresponding to the preset safety water line, traversing the priority labels corresponding to each running target computing power resource group, identifying the target computing power resource groups corresponding to low priority labels and performing the instance release operation;
the corresponding threshold value of the preset safe water line is a preset value. In the initial stage of system deployment, operation and maintenance personnel complete a series of basic configuration through a command line, wherein the configuration of the inventory safety water line is based on the comprehensive evaluation of historical resource demand data of all containers or bare metal instances, and the inventory minimum safety water line is set. This threshold is used to determine when to trigger a mechanism to automatically release low priority instances to ensure that there is sufficient resource reserve when high priority needs come.
In this embodiment, the predetermined safe water line corresponding threshold is a predetermined percentage value to ensure that there is enough resource reserve when the high priority demand comes.
The judgment logic is as follows: if the idle group number of the target computing power resource group divided by its total group number is greater than or equal to the threshold corresponding to the preset safety water line, the inventory is above the safety water line, and the target computing power resource can be created directly.
If the idle group number of the target computing power resource group divided by its total group number is less than the threshold corresponding to the preset safety water line, the inventory is below the safety water line, and low-priority resources are automatically released while the target computing power resource is created, according to the distribution of target computing power resources of each priority.
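A minimal sketch of this judgment logic, with hypothetical names and the threshold expressed as a ratio (a ratio at or above the threshold permits direct creation; below it, release is triggered first):

```python
# Sketch of the safety water line check: compare the idle-to-total
# group ratio against the preset safety water line threshold.

def needs_release(free_groups: int, total_groups: int, threshold: float) -> bool:
    """True when the inventory is below the safety water line, i.e.
    low-priority instances must be released before creating the
    requested high-priority resource group."""
    if total_groups == 0:
        # No groups exist at all: treat as below the water line.
        return True
    return free_groups / total_groups < threshold

# e.g. 6 idle out of 20 total with a 20% threshold: above the water
# line, so the resource can be created directly.
```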
In other embodiments, if the target priority label is a non-high priority label, the creation request count within a preset time, the waiting response time, the number of existing resources and the duration of continuous use of the user are acquired. It is judged whether the creation request count or the waiting response time is greater than a first preset parameter, wherein the first preset parameter is the minimum request count or the maximum waiting response time required to adjust the target priority label to high priority. If the creation request count or the waiting response time is greater than the first preset parameter, the priority label of the current creation request is converted into a high priority label until the current task is executed, and then converted back to the original low priority label. If the creation request count and the waiting response time are less than or equal to the first preset parameter, it is judged whether the number of existing resources and the duration of continuous use are greater than corresponding second preset parameters, wherein the second preset parameters are the minimum number of existing resources and the minimum duration of continuous use required to adjust the target priority label to high priority. If the number of existing resources and the duration of continuous use are greater than the corresponding second preset parameters, the priority label of the current creation request is likewise converted into a high priority label until the current task is executed, and then converted back to the original low priority label. If the number of existing resources and the duration of continuous use are less than or equal to the corresponding second preset parameters, the priority label of the current creation request is maintained as a low priority label.
For a low-priority creation request, in order to avoid losing long-standing customers and to reduce user waiting time, the application comprehensively considers factors such as the frequency of initiated requests within a preset time, the waiting response time, the number of existing resources and the duration of continuous use, and dynamically adjusts the priority label of the creation request. This avoids neglecting the response time of low-priority tasks while still ensuring the stability of critical tasks: the current creation request is temporarily raised to high priority, whether resource release is needed is judged based on the resource quantity in the user resource creation request and the preset safety water line, and if release is needed, resources are released from running low-priority target computing power resource groups according to the corresponding strategy of the application, thereby reducing user waiting time and improving user experience.
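The dynamic priority adjustment described above can be sketched as follows; the parameter names, units and the tuple shape of the preset parameters are illustrative assumptions:

```python
# Sketch of temporary priority promotion for a non-high-priority request.
# first_param  = (min request count, max waiting response time in seconds)
# second_params = (min existing resources, min continuous use in seconds)

def effective_priority(request_count, wait_time_s, existing_resources,
                       continuous_use_s, first_param, second_params):
    """Return 'high' if the request qualifies for temporary promotion,
    otherwise keep the original 'low' label."""
    min_requests, max_wait = first_param
    # Rule 1: frequent requests within the preset time, or excessive waiting.
    if request_count > min_requests or wait_time_s > max_wait:
        return "high"
    min_resources, min_use = second_params
    # Rule 2: a long-standing user with many existing resources.
    if existing_resources > min_resources and continuous_use_s > min_use:
        return "high"
    return "low"
```

After the promoted task finishes, the label would be converted back to the original low priority, as described above.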
Step S112, responding to the user resource creation request based on the resources after the instance release operation.
In the bare metal or container-based computing resource scheduling method provided by the embodiment of the application, the first computing power resource data acquired in real time is input into the safety water line prediction model to predict the safety water line corresponding to the current time, achieving the effect of dynamically adjusting the inventory safety water line in real time according to the actual running conditions of the computing power resources and system resources of the target computing power resource group in the current system. Whether to trigger the mechanism for automatically releasing low-priority target computing power resource groups is then judged based on the dynamic inventory safety water line and the actual running condition of the corresponding resources in the current system, so as to realize flexible scheduling of computing power resources, ensure sufficient resource reserve when high-priority demands arrive in different periods (peak or off-peak), improve resource utilization and scheduling efficiency, ensure the stability of critical tasks, and improve user experience.
The training process of the safe water line prediction model is described in detail below:
Referring to fig. 2, the training process specifically includes the following steps:
step S202, acquiring second computing power resource data, wherein the second computing power resource data comprises historical resource demand data of a target computing power resource group and second priority label data;
The historical resource demand data of the target computing power resource group comprises the use condition of the target computing power resource group in a past specified time period, including historical data of indexes such as CPU utilization, memory usage, disk IO and network traffic of each instance or of the whole. These data are recorded in time series and reflect trends and patterns of resource usage. For example, on a cloud platform, the number of container instances at a certain point in time, together with the CPU usage and memory usage at that time, and so on.
The second priority label data corresponding to the target computing power resource group comprises a priority label corresponding to each computing power resource, and the priority labels are used for identifying the importance levels of different resource groups or services. In the case of limited resources, a high priority resource group may be given priority in resource allocation to ensure smooth execution of critical tasks. The priority label is determined by the cloud platform based on the user grade and the resource service type. For example, for a large important customer, the resources created under its project can be automatically labeled with high priority; some users can label resources as key business, and the platform classifies these labels.
The second priority tag data may be the same as or different from the first priority tag data, and generally, the second priority tag data is adaptively adjusted according to actual resource usage.
Step S204, cleaning, filtering and formatting the second computing power resource data to obtain a labeling data set;
Step S206, determining a characteristic value sequence corresponding to each computing power resource and a plurality of multi-element time sequences corresponding to a plurality of computing power resources based on the labeling data set, wherein the characteristic value sequence comprises a demand average value corresponding to the computing power resource and based on a time axis, and the specific implementation process of the step is as follows:
(1) Extracting key features from the labeling data of the high priority labels in the labeling data set to obtain the characteristic value sequence corresponding to each computing power resource, wherein the characteristic value sequence is a time-axis-based sequence of demand average values of the corresponding resource.
In this embodiment, after normalization processing is performed on the basis of the data types corresponding to the labeling data, the running data after normalization processing is processed by using a moving average algorithm for the same type of resources, so as to obtain an average value sequence corresponding to each resource and based on a time axis. For example, an average time series of CPU utilization and an average time series of memory utilization, etc., from which smooth short-term fluctuations of the various resources, as well as trend characteristics of long-term resource demand, can be seen.
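The smoothing step above can be sketched with a simple moving average, assuming already-normalized utilization samples; the window size is an illustrative assumption:

```python
# Sketch of the moving-average smoothing used to obtain the
# time-axis-based demand average sequence for one resource type.

def moving_average(series, window=3):
    """Simple moving average over a 1-D sequence of utilization samples."""
    out = []
    for i in range(len(series) - window + 1):
        out.append(sum(series[i:i + window]) / window)
    return out

# Normalized CPU utilization samples along the time axis (illustrative).
cpu_util = [0.2, 0.4, 0.6, 0.8, 1.0]
cpu_avg_sequence = moving_average(cpu_util, window=3)
```

The resulting sequence smooths short-term fluctuations while preserving the long-term demand trend, as described above.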
(2) Processing the characteristic value sequence corresponding to each computing power resource through an autoregressive model to obtain a predicted sequence corresponding to each computing power resource, wherein the predicted sequence comprises a sequence based on a time axis and reflecting long-term trend characteristics, periodic characteristics and load mode characteristics;
In this embodiment, the autoregressive model is an ARIMA model, and in other embodiments, the autoregressive model may be a VAR model, an ARMA model, or the like.
(3) Integrating the prediction sequences respectively corresponding to all computing power resources at the same time point on the time axis to obtain a plurality of multi-element time sequences corresponding to the computing power resources.
For example, for integration at the same time node, such as 10:00 on 31 December 2022, the characteristic values of the prediction sequences corresponding to resources such as CPU utilization, memory utilization, disk IO and network traffic at that time node are taken as a column vector, and integrated along the time axis to obtain a multi-element time sequence.
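The integration step can be sketched as follows; the resource names, timestamps and values are illustrative, and the vector column order is fixed by sorting:

```python
# Sketch of integrating per-resource prediction sequences into
# multi-element time sequences: at each time point, the predicted
# values of all resources form one column vector.

predictions = {
    "cpu":     {"2022-12-31T10:00": 0.7, "2022-12-31T11:00": 0.8},
    "memory":  {"2022-12-31T10:00": 0.5, "2022-12-31T11:00": 0.6},
    "disk_io": {"2022-12-31T10:00": 0.3, "2022-12-31T11:00": 0.4},
}

resources = sorted(predictions)            # fixed column order
timestamps = sorted(predictions["cpu"])    # shared time axis

# One vector per timestamp, aligned on the time axis.
multivariate = {
    t: [predictions[r][t] for r in resources] for t in timestamps
}
```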
Step S208, based on a machine learning model, training with the characteristic value sequences corresponding to all computing power resources as model input values and the multi-element time sequences as model output values to obtain the safety water line prediction model.
In the embodiment of the application, standardized data corresponding to a high-priority label is firstly screened out, then data mining is carried out on the standardized data to extract characteristic values corresponding to the peak demand of computing power resources and system resources of a target computing power resource group in different time periods in a high-priority task, the characteristic values are used for training a prediction model, and a complex mathematical relationship between the characteristic values is learned and simulated by deep learning to obtain a safe water line prediction model.
The implementation process of the step of identifying the target computing power resource group corresponding to the low-priority label and performing the instance release operation is described in detail below, and is shown in fig. 3:
Step S302, acquiring priority label data corresponding to all running target computing power resource groups, and forming the target computing power resource groups corresponding to all low-priority labels into a to-be-processed set;
Step S304, for each target computing power resource group in the set to be processed, acquiring the current resource operation data and the preset weight value corresponding to the target computing power resource group, calculating the resource weighted value corresponding to the target computing power resource group based on the current resource operation data and the preset weight value, and judging whether the resource weighted value is smaller than a first threshold, the first threshold being the minimum weighted value required for the target computing power resource group to reach an effective resource utilization rate;
The current resource operation data at least comprise a CPU utilization rate, a memory utilization rate, a disk IO, a network traffic, a service request volume and a time stamp. The preset weight value is the weight corresponding to each index in the current resource operation data (the CPU utilization rate, the memory utilization rate, the disk IO, the network traffic and the service request volume) at different times. For example, if the CPU utilization rate of a container instance is 80% and the preset weight corresponding to the current period is 0.5, the resource weighted value contributed by the CPU utilization rate is 0.4; if the memory utilization rate is 90% and the preset weight corresponding to the current period is 0.7, the value contributed by the memory utilization rate is 0.63; and so on.
The step of calculating the resource weighted value corresponding to the target computing power resource group based on the current resource operation data and the preset weight value comprises: calculating the resource weighted value corresponding to the target computing power resource group according to the following formula:

A = k · Δt · Σᵢ₌₁ᵐ wᵢ · xᵢ

wherein A represents the resource weighted value corresponding to the target computing power resource group; Δt is the single continuous use duration of the target computing power resource group; k is the idle influence evaluation value corresponding to the application scenario in the current period; wᵢ is the weight corresponding to the i-th computing power resource operation index in the current period; xᵢ is the actual operation data corresponding to the i-th computing power resource operation index; and m is the total number of computing power resource operation indexes.
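The weighted-value computation can be sketched as below. Note the hedge: the original formula image is missing from this text, so the structure used here — a weighted sum over the m operation indexes, scaled by k and Δt — is an assumption reconstructed from the variable definitions, and the k and Δt values are illustrative.

```python
# Hedged sketch of the resource weighted value (assumption: the
# per-index products w_i * x_i are summed and scaled by k and Δt).

def resource_weighted_value(delta_t, k, weights, observations):
    """A = k * Δt * Σ_i w_i * x_i over the m operation indexes."""
    assert len(weights) == len(observations)
    return k * delta_t * sum(w * x for w, x in zip(weights, observations))

# Example from the text: CPU at 80% with weight 0.5 (contributes 0.40),
# memory at 90% with weight 0.7 (contributes 0.63); k and Δt set to 1
# here purely for illustration.
A = resource_weighted_value(delta_t=1.0, k=1.0,
                            weights=[0.5, 0.7],
                            observations=[0.8, 0.9])
# 0.5*0.8 + 0.7*0.9 = 1.03 when k = Δt = 1
```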
Step S306, ranking the instances to be released based on the magnitude of the preset weight values of all the instances to be released, to obtain a target release instance sequence;
step S308, releasing the resources of the target computing power resource group instance based on the target release instance sequence and the preset quantity, and updating the inventory information based on the released resources.
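Steps S302 to S308 can be sketched end to end as follows. The group records, field names, thresholds, and the ascending release order are illustrative assumptions; `weighted` stands for the resource weighted value A computed above.

```python
# Sketch of the release pipeline: collect low-priority groups,
# keep those whose weighted value is below the first threshold
# (underutilized), rank by preset weight, release a preset number.
# All field names and the ascending sort order are assumptions.

def select_releases(groups, first_threshold, release_count):
    """Return the groups whose instances should be released."""
    pending = [g for g in groups if g["priority"] == "low"]      # S302
    to_release = [g for g in pending
                  if g["weighted"] < first_threshold]            # S304
    to_release.sort(key=lambda g: g["preset_weight"])            # S306
    return to_release[:release_count]                            # S308

groups = [
    {"id": "g1", "priority": "low",  "weighted": 0.20, "preset_weight": 0.3},
    {"id": "g2", "priority": "low",  "weighted": 0.90, "preset_weight": 0.6},
    {"id": "g3", "priority": "high", "weighted": 0.10, "preset_weight": 0.2},
    {"id": "g4", "priority": "low",  "weighted": 0.15, "preset_weight": 0.1},
]
released = select_releases(groups, first_threshold=0.5, release_count=2)
ids = [g["id"] for g in released]
```

After release, the inventory information would be updated (e.g. the idle group count increased by the number of released groups), as step S308 describes.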
In this embodiment, considering that resource demands fluctuate differently in different time states, the computing power resource utilization of the low-priority target computing power resource groups in the current period is evaluated along different emphasis dimensions, a resource weighted value is calculated based on preset weights that are dynamically adjusted for different time states, and the underutilized target computing power resource groups are then determined based on the resource weighted values as instance resources of the target computing power resource groups to be released. The dynamic weight corresponding to each resource operation index reflects how the demand for each resource changes in different time states, and secondary confirmation of resource release is performed on all target computing power resource groups to be released according to the magnitude of the dynamic preset weight values. This ensures that releasing the fewest target computing power resource groups in the current period meets the demand of high-level instances for the instance resources of the target computing power resource groups, while preemption of the resources of low-level target computing power resource groups is reduced as much as possible.
In another embodiment, after the step of updating the inventory information based on the released resources, the method further includes:
Step S310, judging whether the updated inventory information is lower than or equal to a preset safety water line corresponding threshold value;
step S312, if yes, determining a resource index to be adjusted based on the updated inventory information, wherein the resource index to be adjusted is an index of insufficient current resources determined based on the comparison of the updated inventory information and the inventory safety water line;
If not, continuing to execute step S112, and responding to the user resource creation request based on the resource after the instance release operation.
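The post-release check of steps S310 and S312 can be sketched as below. The assumption that the safe water line is expressed as a minimum acceptable idle/total ratio mirrors the ratio check used when the user request arrives; the numbers are illustrative.

```python
# Sketch of steps S310–S312 (assumption: the inventory safe water
# line is a minimum idle/total group ratio; values are illustrative).

def check_inventory(idle_groups, total_groups, waterline_ratio):
    """Return True when inventory is at or below the safe water line,
    i.e. a resource index to adjust must be determined (S312);
    False means the request can be served directly."""
    return total_groups > 0 and idle_groups / total_groups <= waterline_ratio

needs_adjust = check_inventory(idle_groups=2, total_groups=20,
                               waterline_ratio=0.15)  # 0.10 <= 0.15
ok = check_inventory(idle_groups=6, total_groups=20,
                     waterline_ratio=0.15)            # 0.30 > 0.15
```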
Further, the step of responding to the user resource creation request based on the resource after the instance release operation includes:
(1) Determining the configuration parameters corresponding to the target node scheduling resources based on the resource demand parameters carried in the user resource creation request and a resource-weight-based dynamic resource scheduling algorithm that maximizes resource utilization, calling the target computing power resource group operation interface based on the configuration parameters, and creating a target computing power resource group instance;
In this embodiment, the specific resource requirements in the user creation request, such as CPU, memory and disk size, the relevant parameters of the running resources of the cloud platform, and the relevant parameters corresponding to the safe water line are fed into a resource-weight-based dynamic resource scheduling algorithm that maximizes resource utilization, so as to obtain the configuration parameters corresponding to the target node scheduling resources. In this embodiment the algorithm adopts a Filter Scheduler algorithm; in other embodiments, load balancing algorithms, priority scheduling algorithms, and the like are also possible.
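A filter-then-score scheduler in the spirit of the Filter Scheduler mentioned above can be sketched as follows. The node records, field names, bin-packing-style weighted-utilization score, and weights are all illustrative assumptions, not the embodiment's exact algorithm.

```python
# Sketch: filter nodes that can fit the request, then score the
# feasible nodes by weighted utilization after placement and pick
# the best. Field names and the scoring rule are assumptions.

def schedule(nodes, need_cpu, need_mem, cpu_w=0.6, mem_w=0.4):
    """Return the name of the chosen node, or None if no node fits."""
    feasible = [n for n in nodes
                if n["free_cpu"] >= need_cpu and n["free_mem"] >= need_mem]
    if not feasible:
        return None

    def score(n):
        # Higher post-placement utilization packs work more tightly,
        # keeping other nodes free for large high-priority requests.
        cpu_util = 1 - (n["free_cpu"] - need_cpu) / n["cpu"]
        mem_util = 1 - (n["free_mem"] - need_mem) / n["mem"]
        return cpu_w * cpu_util + mem_w * mem_util

    return max(feasible, key=score)["name"]

nodes = [
    {"name": "node-a", "cpu": 32, "mem": 128, "free_cpu": 30, "free_mem": 100},
    {"name": "node-b", "cpu": 16, "mem": 64,  "free_cpu": 8,  "free_mem": 32},
    {"name": "node-c", "cpu": 16, "mem": 64,  "free_cpu": 2,  "free_mem": 8},
]
target = schedule(nodes, need_cpu=4, need_mem=16)
```

A spreading policy (picking the least-utilized node) is the opposite design choice; which is better depends on whether the goal is consolidation or headroom for bursts.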
(2) And returning the details of the target computing power resource group instance and the access information to the user, and simultaneously updating the inventory information.
In this embodiment, the load of the container or bare metal server instance is predicted based on the high-priority label, and whether the container or bare metal instance needs to be scaled is judged based on the dynamically adjusted inventory safe water line on the premise of ensuring high-quality service, so that scaling happens neither too early, wasting resources, nor too late, reducing service quality. Scheduling of the container or bare metal is then guided by combining the container or bare metal load prediction result with the dynamic resource scheduling algorithm, improving resource utilization.
In the bare metal or container-based computing resource scheduling method provided by the embodiment of the application, time sequence features are first extracted from the historical resource demand data of the target computing power resource groups under high-priority labels; the characteristic values corresponding to resource demand in different periods are trained with a machine learning model, and the complex mathematical relationships between the characteristic values are learned and simulated through deep learning, so as to obtain the safe water line prediction model. Then, based on the operation data of the target computing power resource groups collected in real time, prediction is performed with the safe water line prediction model, so that the inventory safe water line is dynamically adjusted in real time according to the actual operation of the target computing power resource groups in the current system. Next, based on the dynamic inventory safe water line and the actual operation of the target computing power resource groups in the current system, it is judged whether to trigger the mechanism for automatically releasing low-priority instances, realizing flexible scheduling of the target computing power resource groups and ensuring that scaling happens neither too early, wasting resources, nor too late, reducing service quality, while sufficient resources are reserved for high-priority demands arriving in different periods (peak or off-peak), ensuring the stability of key tasks and improving user experience. Finally, resource scheduling of the target computing power resource groups is guided in combination with the dynamic resource scheduling algorithm, improving resource utilization.
Further, considering that resource demands fluctuate differently in different time states, the resource utilization of the low-priority target computing power resource groups in the current period is evaluated along different emphasis dimensions, a resource weighted value is calculated based on preset weights that are dynamically adjusted for different time states, and the underutilized target computing power resource groups are determined based on the resource weighted values as instances to be released. The dynamic weight corresponding to each resource operation index reflects how the demand for each resource changes in different time states, and secondary confirmation of resource release is performed on all instances to be released according to the magnitude of the dynamic preset weight values, so that releasing the fewest instance resources in the current period meets the demand of high-level instances for resources, while preemption of low-level instance resources is reduced as much as possible.
Based on the above method embodiment, the embodiment of the present application further provides a bare metal or container based computing resource scheduling apparatus, referring to fig. 4, which includes a plurality of modules for performing the steps of the bare metal or container based computing resource scheduling method described in the above embodiment, where the plurality of modules includes a data acquisition module 402, a water line prediction module 404, a task monitoring module 406, a request parsing module 408, an instance release module 410, and a request response module 412, where:
The data acquisition module 402 is used for acquiring first computing power resource data, where the first computing power resource data includes real-time resource demand data, real-time resource supply data and first priority label data of the target computing power resource groups; the water line prediction module 404 is used for predicting the first computing power resource data through a pre-trained safe water line prediction model to obtain an inventory safe water line corresponding to the current period, where the inventory safe water line represents the minimum value of the instance resource demand of the high-priority target computing power resource groups; the task monitoring module 406 is used for performing system initialization deployment according to the inventory safe water line, starting a monitoring task to monitor the use data of target computing power resources and system resources in real time, and converting the use data into inventory information, where the inventory information at least includes the total group number and the idle group number corresponding to the target computing power resource groups; the request parsing module 408 is used for, when a newly added user resource creation request is monitored, parsing the user resource creation request to obtain a target priority label; the instance release module 410 is used for, if the target priority label is a high-priority label and the ratio of the idle group number to the total group number is judged, based on the inventory information, to be not greater than the threshold corresponding to the preset safe water line, identifying the target computing power resource groups corresponding to low-priority labels and performing an instance release operation; and the request response module 412 is used for responding to the user resource creation request based on the resources after the instance release operation.
The device further comprises a model training module, where the model training module is used for executing the training process of the safe water line prediction model: acquiring second computing power resource data, where the second computing power resource data includes historical resource demand data and second priority label data of the target computing power resource groups; cleaning, filtering and formatting the second computing power resource data to obtain a labeled data set; determining, based on the labeled data set, a characteristic value sequence corresponding to each computing power resource and a plurality of multivariate time sequences corresponding to the plurality of computing power resources, where the characteristic value sequence includes demand average values of the computing power resource arranged along the time axis; and, based on a machine learning model, training with the characteristic value sequences corresponding to all computing power resources as model input values and the multivariate time sequences as model output values, to obtain the safe water line prediction model.
Further, the model training module is used for extracting key features from the labeled data of high-priority labels in the labeled data set to obtain the characteristic value sequence corresponding to each computing power resource; processing the characteristic value sequence corresponding to each computing power resource through an autoregressive model to obtain a prediction sequence corresponding to each computing power resource, where the prediction sequence reflects long-term trend features, periodic features and load mode features along the time axis; and integrating the prediction sequences of all computing power resources corresponding to the same time point on the time axis, to obtain the multivariate time sequences corresponding to the plurality of computing power resources.
Further, the above-mentioned instance release module 410 is configured to: obtain the priority label data corresponding to all running target computing power resource groups, and form the target computing power resource groups corresponding to all low-priority labels into a set to be processed; for each target computing power resource group in the set to be processed, obtain the current resource operation data and the preset weight value corresponding to the target computing power resource group, calculate the resource weighted value corresponding to the target computing power resource group based on the current resource operation data and the preset weight value, and judge whether the resource weighted value is smaller than a first threshold, the first threshold being the minimum weighted value required for the target computing power resource group to reach an effective resource utilization rate; if so, determine the instance corresponding to the target computing power resource group as an instance to be released; rank the instances to be released based on the magnitude of the preset weight values of all the instances to be released, to obtain a target release instance sequence; release the resources of the target computing power resource group instances based on the target release instance sequence and a preset quantity; and update the inventory information based on the released resources.
Further, the above-mentioned example releasing module 410 is configured to calculate the resource weighted value corresponding to the target computing power resource group according to the following specified formula:
A = k · Δt · Σᵢ₌₁ᵐ wᵢ · xᵢ

wherein A represents the resource weighted value corresponding to the target computing power resource group; Δt is the single continuous use duration of the target computing power resource group; k is the idle influence evaluation value corresponding to the application scenario in the current period; wᵢ is the weight corresponding to the i-th computing power resource operation index in the current period; xᵢ is the actual operation data corresponding to the i-th computing power resource operation index; and m is the total number of computing power resource operation indexes.
Further, the above-mentioned instance release module 410 is configured to judge whether the updated inventory information is lower than or equal to the threshold corresponding to the preset safe water line; if so, determine a resource index to be adjusted based on the updated inventory information, the resource index to be adjusted being an index of currently insufficient resources determined by comparing the updated inventory information with the inventory safe water line, and calculate, based on the preset weight value corresponding to the resource index to be adjusted, an updated preset weight value corresponding to each instance to be released; if not, continue to respond to the user resource creation request based on the resources after the instance release operation.
Further, the request response module 412 is configured to determine a configuration parameter corresponding to the target node scheduling resource based on the resource demand parameter carried in the user resource creation request and the dynamic resource scheduling algorithm with the maximum resource utilization rate of the resource weight, call the target computing power resource group operation interface based on the configuration parameter, create a target computing power resource group instance, return details and access information of the target computing power resource group instance to the user, and update inventory information.
The device provided by the embodiment of the present application has the same implementation principle and technical effects as those of the foregoing method embodiment, and for the sake of brief description, reference may be made to the corresponding content in the foregoing method embodiment where the device embodiment is not mentioned.
The embodiment of the application also provides a bare metal or container-based computing resource scheduling system, as shown in fig. 5, which is a schematic structural diagram of the system, wherein the system comprises a processor 51 and a memory 50, the memory 50 stores computer executable instructions capable of being executed by the processor 51, and the processor 51 executes the computer executable instructions to implement the method.
In the embodiment shown in fig. 5, the system further comprises a bus 52 and a communication interface 53, wherein the processor 51, the communication interface 53 and the memory 50 are connected by the bus 52.
The memory 50 may include a high-speed random access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. The communication connection between the system network element and at least one other network element is achieved via at least one communication interface 53 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, etc. may be used. The bus 52 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 52 may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one bi-directional arrow is shown in fig. 5, but this does not mean there is only one bus or one type of bus.
The processor 51 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 51 or by instructions in the form of software. The processor 51 may be a general-purpose processor, including a central processing unit (CPU) and a network processor (NP), or may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor 51 reads the information in the memory and, in combination with its hardware, performs the steps of the method of the previous embodiments.
The embodiment of the application also provides a computer readable storage medium, which stores computer executable instructions that, when being called and executed by a processor, cause the processor to implement the above method, and the specific implementation can refer to the foregoing method embodiment and will not be described herein.
The computer program product of the method, the apparatus and the system provided in the embodiments of the present application includes a computer readable storage medium storing a program code, where the program code includes instructions for executing the method described in the foregoing method embodiment, and specific implementation may refer to the method embodiment and will not be described herein.
The relative steps, numerical expressions and numerical values of the components and steps set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In the description of the present application, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present application and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
It should be noted that the foregoing embodiments are merely illustrative embodiments of the present application and are not restrictive, and the scope of the application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that modifications, variations or substitutions of some of the technical features described in the foregoing embodiments can still be easily conceived, and such modifications, variations or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.