Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
For services that use computing power resources, the current scheme provides a unified task layer: instances of a plurality of services are placed in a single task-layer queue to wait for scheduling, and the scheduling strategy is essentially to obtain or configure a priority for each service and preempt computing power resources according to that priority.
Taking services A and B as an example, service A has instances a1, a2, a3, etc., and service B has instances b1, b2, b3, etc. Suppose instance tasks corresponding to instances of services A and B wait for scheduling in the task-layer queue. When the resource unit of service A shows available computing power resources, if the priority of service B is higher than that of service A, the available computing power resources corresponding to service A can be preempted by an instance of service B. This leads to the situation that service A has a resource unit but no resources, so service A cannot run, which affects user experience.
In order to solve the above technical problem, the embodiment of the present application configures a container (such as a namespace, a file system, or a resource view) for each service to isolate services from one another. When an instance is submitted, resource preemption is carried out only according to whether available resources exist in the container corresponding to the service to which the instance belongs, so that resource preemption is constrained to a queue at service granularity and resource preemption among different services is avoided.
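The isolation rule just described can be illustrated with a minimal Python sketch. All names, quota values, and the `Container` structure are assumptions for illustration only; the application does not specify an implementation.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """A container (e.g. a namespace) with its own computing power quota."""
    quota: int     # total GPU cards allotted to this service's container
    used: int = 0  # cards currently occupied by running instances

    def has_available(self, need: int = 1) -> bool:
        return self.used + need <= self.quota


# One container per service: preemption is confined to each container,
# so an instance of service B can never take service A's cards.
containers = {"service_a": Container(quota=2), "service_b": Container(quota=2)}


def submit(service: str, need: int = 1) -> bool:
    """Run the instance only if its own service's container has room."""
    c = containers[service]
    if c.has_available(need):
        c.used += need
        return True   # instance runs immediately
    return False      # instance waits in this container's queue instead
```

Even when the container of `service_a` is full, submitting an instance of `service_b` still succeeds against its own quota, which is the isolation property the embodiment aims for.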
It should be noted that, in the computing power resource allocation method provided by the embodiment of the present application, services are isolated by containers: when an instance is submitted, resource preemption is performed only according to whether the container corresponding to the service to which the instance belongs has available computing power resources, so that computing power resource preemption is constrained to a queue at service granularity and resource preemption among different services is avoided.
The method provided by the embodiment of the present application relates to the field of cloud technology, such as cloud computing. Cloud computing refers to a delivery and use mode of information technology (IT) infrastructure, that is, a mode of acquiring required resources through a network in an on-demand and easily-expandable manner; cloud computing in a broad sense refers to a delivery and use mode of services, that is, a mode of acquiring required services through a network in an on-demand and easily-expandable manner. Such services may be IT, software, or internet related, or other services. Cloud computing is a product of the fusion and development of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage technologies, virtualization, and load balancing.
The method provided by the embodiment of the present application also relates to the field of artificial intelligence (AI), a comprehensive technology of computer science that aims to understand the essence of intelligence and produce a new kind of intelligent machine capable of reacting in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline involving a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include directions such as computer vision, speech processing, natural language processing, machine learning/deep learning, and autonomous driving.
With the development of artificial intelligence, AI training needs to be run in more and more scenarios to train AI models, and computing power resources, such as GPU computing power resources, need to be delivered when AI training is run. Through the method provided by the embodiment of the present application, the AI training of the same service is placed in the same container, realizing isolation among different services, so that after an AI training instance is submitted to run, resource preemption among different services is avoided.
The method provided by the embodiment of the present application can be applied to a data processing device, which may be a server or a terminal device. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in the present application.
Next, the system architecture of the computing power resource allocation method will be described, taking the data processing device being a server as an example. Referring to fig. 1, fig. 1 is a schematic system architecture diagram of a computing power resource allocation method according to an embodiment of the present application. The system architecture comprises a server 101, on which a computing power platform can be deployed to provide services for various businesses.
The computing power platform needs to deliver computing power resources, such as GPU computing power resources, when a business runs AI training. In order to ensure isolation among businesses, corresponding containers are configured for different businesses, and each container is configured with a corresponding computing power resource quota.
A container is a virtualization technology in a computer operating system that enables a process to run in a relatively independent and isolated environment, which can simplify the software deployment flow, enhance the portability and security of software, and improve the utilization of computing power resources. Container technology is widely used in server scenarios in the field of cloud computing. In this embodiment, the container may be an independent file system, a namespace, or a resource view; the present application is mainly described using a namespace as the container.
The container can have the capability of managing and controlling business computing power resources; that is, a quota is made for a single container, and a corresponding computing power resource quota is configured for the container to ensure the running quality of multiple instances. This is called the container quota; if the container is a namespace, it is called a namespace quota. The data processing device, for example a server, may be provided with GPU cards, central processing units (CPUs), memory, disk space, and the like, and namespaces may be obtained by dividing the GPU cards, CPUs, memory, and disk space.
In AI training operation, a service is further divided into a plurality of instances that share the computing power resources in the container corresponding to the service. When a certain instance of a target service, such as a first instance, is submitted to run, the server 101 converts the acquired first instance into an instance task, and determines the target container corresponding to the target service according to the correspondence between services and containers, so as to determine, according to the computing power resource quota corresponding to the target container, whether available computing power resources exist in the target container (that is, whether its computing power resource quota is full). If it is determined that available computing power resources exist in the target container, the instance task corresponding to the first instance is executed. If it is determined that no available computing power resources exist in the target container, the server 101 controls the instance task corresponding to the first instance to enter the task queue of the target container to wait to run. When it is monitored that available computing power resources appear in the target container, the server 101 schedules a target instance task from the task queue to run.
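The submit/enqueue/schedule lifecycle performed by the server 101 can be sketched as follows. This is a simplified illustration: `ContainerQueue` and its method names are assumptions, and a real implementation would be event driven rather than synchronous.

```python
from collections import deque


class ContainerQueue:
    """One service's container: a quota plus a task queue of waiting tasks."""

    def __init__(self, quota: int):
        self.quota = quota
        self.used = 0
        self.queue = deque()  # instance tasks waiting inside this container

    def submit(self, task: str, need: int = 1) -> str:
        # Run directly if the container's quota is not full, else enqueue.
        if self.used + need <= self.quota:
            self.used += need
            return "running"
        self.queue.append((task, need))
        return "queued"

    def release(self, need: int = 1):
        """A running task finished; schedule the next waiting task if it fits."""
        self.used -= need
        if self.queue and self.used + self.queue[0][1] <= self.quota:
            task, n = self.queue.popleft()
            self.used += n
            return task
        return None
```

Here `release` plays the role of the monitoring step: the moment quota frees up inside the container, a waiting task of the same service (and only of the same service) is scheduled.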
The target instance task may be any instance task in the task queue; it may be determined based on priority, confirmed according to a first-in first-out principle, and the like. In some cases, the target instance task may be the instance task corresponding to the first instance.
Next, taking the container being a namespace as an example, the computing power resource allocation method provided by the embodiment of the present application will be described in detail with reference to the accompanying drawings.
Referring to fig. 2, fig. 2 shows a flow chart of a method of computing power resource allocation, the method comprising:
S201, the data processing device converts the acquired first instance into an instance task, wherein the first instance belongs to a target service.
In this embodiment, corresponding containers may be configured for different services, where each container is configured with a corresponding computing power resource quota. In some cases, the computing power resource quota may also be configured for the container according to the instances in the container; that is, each instance is configured with a corresponding computing power resource quota, and the computing power resource quota of the container is the sum of the computing power resource quotas of the instances in the container.
The computing power resource quota corresponding to each container and/or the computing power resource quota of each instance may be recorded in a database (DB), as shown in fig. 3. As AI training proceeds, the computing power resources may be gradually occupied, and the occupied or available computing power resources may be recorded in the database at any time, making it convenient to confirm later whether the computing power resource quota is full, that is, whether available computing power resources exist in the container. Of course, this information may instead be recorded on the data processing device that performs the computing power resource allocation method, which is not limited in this embodiment.
If one data processing device is provided with a number of GPU cards and the corresponding CPU, memory, and disk space, then when dividing containers (for example, namespaces), the GPU cards may be divided equally, and the CPU, memory, and disk space corresponding to each GPU card may be obtained equally according to the number of GPU cards. For example, if one GPU device has an 8-card GPU, a 96-core CPU, 512 G of memory, and 4 T of disk space, the GPU device is divided into 8 parts according to the GPU cards, and each part of CPU, memory, and disk space is 12 cores, 64 G, and 500 G respectively.
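The equal division in the example above amounts to integer division by the number of GPU cards, as in this sketch (the 4 T disk is treated as roughly 4000 G, matching the 500 G per-card figure in the text):

```python
def split_per_card(cards: int, cpu_cores: int, mem_gb: int, disk_gb: int) -> dict:
    """Divide a device's CPU, memory, and disk equally across its GPU cards."""
    return {
        "cpu_cores": cpu_cores // cards,
        "mem_gb": mem_gb // cards,
        "disk_gb": disk_gb // cards,
    }


# The 8-card example: 96 cores, 512 G memory, 4 T (~4000 G) disk.
per_card = split_per_card(8, 96, 512, 4000)
```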
Referring to fig. 3, taking the data processing device being a GPU device and the container being a namespace as an example, if the namespaces corresponding to the services include namespaces A-D, each namespace performs a namespace quota, and the configured computing power resource quota of a namespace may be expressed as a number of quota cards. For example, in fig. 3 the quota of namespace A is M cards, the quota of namespace B is N cards, the quota of namespace C is K cards, and the quota of namespace D is L cards. The instances of the corresponding service may run in each namespace, and each instance is configured with its own computing power resource quota. For example, the instances running in namespace A include instances A1, A2, ..., where the quota of instance A1 is a1 cards and the quota of instance A2 is a2 cards; the instances running in namespace B include instances B1, B2, ..., where the quota of instance B1 is b1 cards; the instances running in namespace C include instances C1, C2, ..., where the quota of instance C1 is c1 cards and the quota of instance C2 is c2 cards; and the instances running in namespace D include instances D1, D2, ..., where the quota of instance D1 is d1 cards and the quota of instance D2 is d2 cards.
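The quota layout of fig. 3 can be represented as nested mappings, with the container quota matching the sum of its instance quotas as described under S201. The numeric values below are placeholders; fig. 3 uses symbolic quotas such as M, N, a1, a2.

```python
# Placeholder quota table in the shape of fig. 3 (values are illustrative).
namespaces = {
    "A": {"quota": 4, "instances": {"A1": 2, "A2": 2}},
    "B": {"quota": 3, "instances": {"B1": 2, "B2": 1}},
}


def container_quota(name: str) -> int:
    """Container quota as the sum of the per-instance quotas within it."""
    return sum(namespaces[name]["instances"].values())
```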
When a first instance of the target service is submitted to run, the data processing device may convert the first instance into an instance task (task) for scheduling and running. For example, as shown in fig. 4, if the first instance is denoted by A1, after the first instance is acquired it may be converted into an instance task, denoted by instance task A1.
S202, the data processing device determines the target container corresponding to the target service according to the correspondence between services and containers.
S203, if the data processing device determines, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, it controls the instance task corresponding to the first instance to enter the task queue of the target container.
Since different services are configured with corresponding containers, the data processing device may determine the corresponding target container according to the target service to which the first instance belongs, so as to know whether its computing power resource quota can satisfy the running of the service instance. In one possible implementation, the first instance acquired by the data processing device may carry a corresponding service identifier, so that the target container corresponding to the target service is determined according to the service identifier and the correspondence.
The computing power resource quota corresponding to the target container can be read from the database by the data processing device so as to conveniently determine whether available computing power resources exist in the target container; the occupied computing power resource amount can also be read from the database to determine whether the computing power resource quota corresponding to the target container is full.
If the data processing device determines that available computing power resources exist in the target container, the instance task corresponding to the first instance is executed; for example, as shown in fig. 4, instance task A1 corresponding to the first instance is input to a scheduler to run (task-running). If the data processing device determines that no available computing power resources exist in the target container, it controls the instance task corresponding to the first instance to enter the task queue of the target container to wait.
Fig. 4 also includes other instances submitted to run, such as instance A2, instance A3, and instance A4, whose corresponding instance tasks are instance task A2, instance task A3, and instance task A4; according to steps S201-S203, it may be determined whether instance task A2, instance task A3, and instance task A4 run or enter the task queue. Fig. 4 shows the case where instance task A1 runs directly while instance task A2, instance task A3, and instance task A4 enter the task queue to wait; when instance A1 has a new task to run, if it is determined that the computing power resource quota is full again, the instance task corresponding to instance A1 also enters the task queue to wait.
In some cases, the computing power resource quota of each container is configured according to its instances, and the computing power resource quota of the target container is the sum of the computing power resource quotas of the instances in the target container. If it is determined that available computing power resources exist in the target container, they are not necessarily computing power resources usable for running the first instance, since each instance is configured with its own computing power resource quota. In this case, the data processing device may further determine whether the computing power resource quota of the first instance is full, and if so, the step of S203 is performed.
S204, when the data processing device monitors that available computing power resources exist in the target container, it schedules a target instance task from the task queue to run.
The data processing device can monitor at regular intervals whether available computing power resources appear in the target container; when available computing power resources appear, that is, when the computing power resource quota of the target container becomes free, a target instance task is screened from the task queue and put into operation. The task queue may be polled periodically, for example by a scheduler, and when available computing power resources appear, the target instance task is screened out by the scheduler to run. For example, as shown in fig. 4, if instance task Ax is determined to be the target instance task according to priority, the scheduler preferentially selects instance task Ax to run.
It should be noted that the waiting instance tasks in the task queue may include a plurality of instance tasks, that is, may include other instance tasks in addition to the instance task corresponding to the first instance. The scheduler may screen which instance task is put into operation from the task queue according to different principles, such as a priority preemption principle or a first-in first-out principle (that is, the instance task that entered the task queue first is preferentially scheduled to run by the scheduler). In one possible implementation, taking the case where the task queue includes the instance task corresponding to the first instance and an instance task corresponding to a second instance, where the second instance also belongs to the target service, an implementation of S204 may be to obtain the priority of the first instance and the priority of the second instance, and if it is determined that the priority of the first instance is higher than the priority of the second instance, schedule the instance task corresponding to the first instance, as the target instance task, from the task queue to run. That is, the instance task corresponding to the first instance, which has the higher priority, can preempt the computing power resources, and the computing power resource preemption occurs within the target container without affecting other services.
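The priority-preemption choice in S204 can be sketched as selecting the highest-priority waiting task from the container's own queue. The task names and priority values below are illustrative assumptions.

```python
def pick_by_priority(task_queue):
    """Return the waiting task with the highest priority; ties favor the
    task appearing earlier in the queue (max keeps the first maximum)."""
    return max(task_queue, key=lambda t: t["priority"])


# The first instance (priority 5) outranks the second instance (priority 3),
# so its task preempts the freed quota -- but only inside this container.
task_queue = [
    {"task": "task_first", "priority": 5},
    {"task": "task_second", "priority": 3},
]
```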
In one possible implementation, the computing power resource quota of each container is configured according to its instances, that is, each instance is configured with a corresponding computing power resource quota. If the data processing device monitors that available computing power resources appear in the target container, it can further identify which instance the available computing power resources correspond to. If it is determined that the available computing power resources are determined according to the computing power resource quota of the first instance, that is, the available computing power resources are the computing power resources of the first instance, this indicates that the computing power resource quota of the first instance has become free, and the data processing device takes the instance task corresponding to the first instance as the target instance task and schedules it from the task queue to run.
However, the priorities of the plurality of instance tasks in the task queue may differ; the higher the priority, the more important the corresponding instance task may be considered and the more it needs to run first, so as to better meet user requirements. Therefore, in this case, even if the available computing power resources are the computing power resources of a certain instance, indicating that the computing power resource quota of that instance has become free, computing power resource preemption within the target container still needs to be performed according to the priorities of the instance tasks, because the task queue also includes other instance tasks, so that instance tasks with higher priority run first.
Taking the case where the task queue includes the instance task corresponding to the first instance and the instance task corresponding to the second instance as an example: even if the available computing power resources are determined according to the computing power resource quota of the second instance, that is, the available computing power resources are the computing power resources of the second instance, indicating that the computing power resource quota of the second instance has become free, the data processing device may further acquire the priority of the first instance and the priority of the second instance. If it is determined that the priority of the first instance is higher than the priority of the second instance, the instance task corresponding to the first instance is taken as the target instance task and scheduled from the task queue to run; that is, the first instance preempts the computing power resources of the second instance.
While the computing power resources of the second instance are preempted by the first instance, the data processing device continues to monitor whether the target container has available computing power resources; when it is monitored again that available computing power resources appear in the target container, the instance task corresponding to the second instance is scheduled from the task queue to run. Preferentially executing the instance task corresponding to this instance when available computing power resources appear again avoids the problem that, in resource delivery, an instance can never run because other high-priority instances constantly occupy the computing power resources; that is, it avoids instance starvation.
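The starvation-avoidance rule above can be sketched as follows: if the quota that just freed belongs to an instance whose quota was previously preempted, that instance's task is scheduled first; otherwise scheduling falls back to the usual in-container priority preemption. All names here are illustrative assumptions, not terms from the application.

```python
def schedule_next(task_queue, freed_for, preempted):
    """freed_for: instance whose quota just became free; preempted: set of
    instances whose freed quota was previously taken by a higher priority."""
    if freed_for in preempted:
        for i, t in enumerate(task_queue):
            if t["instance"] == freed_for:
                return task_queue.pop(i)  # run the starved instance first
    # Otherwise, usual in-container priority preemption applies.
    best = max(task_queue, key=lambda t: t["priority"])
    task_queue.remove(best)
    return best
```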
According to the above technical scheme, corresponding containers are configured for different services, so that different services are isolated by containers, and the instance tasks corresponding to each service run in the corresponding container without disturbing other tasks. The containers can have the capability of managing and controlling computing power resources, that is, each container is configured with a corresponding computing power resource quota, which ensures the running quality of multi-instance tasks. In this way, when a submitted instance is obtained for the target service, taking the first instance as an example, the acquired first instance can be converted into an instance task, and the target container corresponding to the target service is determined according to the correspondence between services and containers. If it is determined, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, the instance task corresponding to the first instance is controlled to enter the task queue of the target container, so that the first instance runs using the computing power resources configured for the target container and does not preempt the computing power resources of other services.
When available computing power resources exist in the target container, the target instance task is scheduled from the task queue to run. That is, instance tasks belonging to the same service enter the task queue of the corresponding container to wait for scheduling, and once available computing power resources appear in the container, the target instance task is scheduled from the task queue to run. This ensures that the running of instance tasks belonging to the same service is isolated within the corresponding container; even if priority preemption occurs during scheduling, it is constrained within the container, avoiding the problem of computing power resource preemption among different services affecting the running of other services and, in turn, user experience.
In addition, multi-tenant isolation of services is realized through containers, task scheduling is integrated at service granularity, and operation of the computing power platform is facilitated.
As the number of instances of a service in a container increases, the computing power resource quota of the container may no longer meet the service requirement, so the container may be expanded. Taking the case where the target container needs to be expanded as an example, the data processing device can receive an expansion application that includes a quota expansion capacity, which is determined according to the newly added instances. The data processing device then adjusts the computing power resource quota of the target container according to the quota expansion capacity. The expansion application can be triggered by an operator on the computing power platform.
When applying to expand the target container, the computing power resources used for the expansion are obtained from a resource pool; however, the computing power resources in the resource pool are also limited, and the computing power resource quota of the target container cannot be expanded infinitely. A flow chart of a method for expanding the target container is shown in fig. 5, and the method includes:
S501, the data processing equipment receives a capacity expansion application.
S502, the data processing device determines whether the idle resource amount in the resource pool satisfies the quota expansion capacity. If not, S503 is executed; if yes, S504 is executed.
S503, the data processing equipment expands the capacity of the resource pool.
If it is determined that the idle resource amount in the resource pool does not satisfy the quota expansion capacity, the resource pool may be expanded first, and then the expanded resource pool is used to adjust the computing power resource quota of the target container according to the quota expansion capacity; that is, S504 is executed to expand the computing power resource quota of the target container.
S504, the data processing equipment adjusts the computing power resource limit of the target container according to the quota expansion capacity.
S505, the data processing device adjusts the idle resource amount of the resource pool according to the quota expansion capacity.
After the computing power resource quota of the target container is adjusted, the idle resource amount in the resource pool decreases, so the idle resource amount in the resource pool can be updated according to the quota expansion capacity, and S502 can then be executed with the accurate idle resource amount when quota expansion is performed again.
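The flow of S501-S505 can be sketched as follows. The pool expansion step size is an assumed parameter; the application does not fix how the resource pool itself is enlarged.

```python
def expand_container(pool_idle: int, container_quota: int,
                     expand_by: int, pool_expand_step: int = 8):
    """Expand a container's quota, first growing the resource pool if needed."""
    # S502: does the pool's idle amount satisfy the quota expansion capacity?
    while pool_idle < expand_by:
        pool_idle += pool_expand_step   # S503: expand the resource pool first
    container_quota += expand_by        # S504: adjust the container's quota
    pool_idle -= expand_by              # S505: update the pool's idle amount
    return pool_idle, container_quota
```

For instance, with 2 idle cards in the pool, expanding a 4-card container by 6 cards first grows the pool, then raises the quota to 10 and leaves 4 idle cards.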
In the embodiment of the present application, services are isolated through containers; when the instances in a certain service's container increase, an operator of the computing power platform can evaluate the increment of computing power resources for that container (that is, the quota expansion capacity) and expand it accordingly, which solves the problem that the quota expansion capacity is difficult to evaluate in the conventional unified task layer.
Next, the computing power resource allocation method provided by the embodiment of the present application will be described in connection with a practical application scenario. In scenarios where a service runs AI training and requires GPU computing power resources, in order to avoid computing power resource preemption between services, namespaces can be configured for the services to isolate them: different namespaces share the physical GPU computing power resources but do not share each namespace's logical GPU computing power quota. A service is divided into a plurality of instances in AI training operation, and on the premise that the instances share the GPU computing power resources in the same namespace, respective computing power resource quotas are configured per instance. Based on this, taking the data processing device being a server as an example, when a certain instance, such as the first instance, is submitted to run, computing power resource allocation may be implemented based on the method shown in fig. 6, which includes:
S601, the server converts the acquired first instance into an instance task.
S602, if the server determines that available computing power resources exist in the target namespace, it runs the instance task.
Because the first instance belongs to the target service, and in this embodiment namespaces are configured for services to isolate them, each service has a correspondence with a namespace. The first instance acquired by the server may carry a corresponding service identifier, so that the target namespace corresponding to the target service is determined according to the service identifier and the correspondence.
If the server determines that the computing power resource quota of the target namespace is not full, that is, available computing power resources exist, the instance task corresponding to the first instance can be run directly.
S603, if the server determines that no available computing power resources exist in the target namespace, it controls the instance task corresponding to the first instance to enter the task queue of the target namespace.
If the server determines that the computing power resource quota of the target namespace is full, that is, no available computing power resources exist, it controls the instance task corresponding to the first instance to enter the task queue to wait.
S604, the server polls to monitor whether available computing power resources appear in the target namespace.
S605, when the server monitors that available computing power resources appear in the target namespace, it schedules the target instance task from the task queue to run.
The waiting example tasks in the task queue may include a plurality of example tasks, that is, may include other example tasks besides the example task corresponding to the first example, where the other example tasks may enter the task queue before the example task corresponding to the first example, or may enter the task queue after the example task corresponding to the first example, which is not limited in this embodiment.
The server may poll periodically to monitor whether available computing power resources appear in the target namespace, and if so, schedule the instance tasks in the task queue based on a priority preemption principle, thereby restricting priority preemption to the target namespace.
For example, the task queue includes an instance task corresponding to the first instance and an instance task corresponding to a second instance, where the second instance also belongs to the target service. If the server determines that the priority of the first instance is higher than that of the second instance, the instance task corresponding to the first instance preempts the available computing power resources, and the server schedules it from the task queue to run.
Of course, the instance tasks in the task queue may also be scheduled based on a first-in first-out principle. For example, the task queue includes an instance task corresponding to the first instance and an instance task corresponding to a second instance, where the second instance also belongs to the target service and its instance task entered the task queue after that of the first instance. If the server detects that available computing power resources appear in the target namespace, even if those resources became available because the computing power resource quota of the second instance freed up, the instance task corresponding to the first instance entered the task queue first and is therefore scheduled to run first, so that resource preemption is realized within the target namespace.
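The two scheduling policies described above can be sketched as a single selection step over a per-namespace wait queue. This is an illustrative sketch under stated assumptions (entry layout and function name are invented for exposition), not the claimed scheduler:

```python
def pick_next(queue, policy="fifo"):
    """Return and remove the next instance task from a per-namespace
    wait queue. Each entry is (arrival_order, priority, task_name)."""
    if not queue:
        return None
    if policy == "priority":
        # Higher priority wins; earlier arrival breaks ties.
        entry = max(queue, key=lambda e: (e[1], -e[0]))
    else:  # "fifo"
        entry = min(queue, key=lambda e: e[0])
    queue.remove(entry)
    return entry[2]

queue = [(0, 1, "b1_low"), (1, 5, "b2_high")]
print(pick_next(list(queue), "priority"))  # b2_high
print(pick_next(list(queue), "fifo"))      # b1_low
```

Either way, the selection only ever considers tasks already inside the target namespace's queue, so preemption stays confined to one service.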
According to the embodiment of the application, corresponding namespaces are configured for different services, so that the services are isolated by the namespaces and the instance tasks of each service run in the corresponding namespace without interfering with other tasks. Each namespace has computing power resource management capability, that is, a corresponding computing power resource quota is configured in each namespace, which guarantees the running quality of multi-instance tasks. Thus, when the first instance submitted for the target service is obtained, the server can convert it into an instance task and determine the target namespace corresponding to the target service according to the correspondence between services and namespaces. If it is determined that no available computing power resources exist in the target namespace, the instance task corresponding to the first instance is controlled to enter the task queue of the target namespace, so that the first instance runs on the computing power resources configured for the target namespace without preempting the computing power resources of other services. When available computing power resources appear in the target namespace, the target instance task is scheduled from the task queue and put into operation, ensuring that instance tasks belonging to the same service run in isolation within the corresponding namespace; even when priority preemption occurs during scheduling, it is confined to the namespace. This avoids the problem that computing power resource preemption among different services affects the operation of other services and thus the user experience.
Based on the computing power resource allocation method provided by the embodiment corresponding to fig. 2, the embodiment of the present application further provides a computing power resource allocation device, in which containers are configured for different services and each container is configured with a corresponding computing power resource amount. Referring to fig. 7, the device 700 includes a conversion unit 701, a determination unit 702, an entry unit 703, and a scheduling unit 704:
The conversion unit 701 is configured to convert the obtained first instance into an instance task, where the first instance belongs to a target service;
The determining unit 702 is configured to determine, according to a correspondence between a service and a container, a target container corresponding to the target service;
The entry unit 703 is configured to control the instance task corresponding to the first instance to enter a task queue of the target container if it is determined, according to the computing power resource amount corresponding to the target container, that no available computing power resources exist in the target container;
The scheduling unit 704 is configured to schedule, when it is detected that available computing power resources exist in the target container, a target instance task from the task queue to be put into operation.
In a possible implementation manner, if the task queue further includes an instance task corresponding to a second instance, where the second instance belongs to the target service, the scheduling unit 704 is configured to:
acquiring the priority of the first instance and the priority of the second instance;
If the priority of the first instance is higher than that of the second instance, taking the instance task corresponding to the first instance as the target instance task;
And scheduling the instance task corresponding to the first instance from the task queue to be put into operation.
In a possible implementation manner, the computing power resource amount of each container is configured per instance, and the computing power resource amount of the target container is the sum of the computing power resource amounts of the instances in the target container. The determining unit 702 is further configured to:
If it is determined, according to the computing power resource amount corresponding to the target container, that available computing power resources exist in the target container, determining whether the computing power resource quota of the first instance is fully occupied;
and if the determining unit determines that the computing power resource quota of the first instance is fully occupied, triggering the entry unit to execute the step of controlling the instance task corresponding to the first instance to enter the task queue of the target container.
In a possible implementation manner, the scheduling unit 704 is configured to:
If the available computing power resources are determined to originate from the computing power resource quota of the first instance, taking the instance task corresponding to the first instance as the target instance task;
And scheduling the instance task corresponding to the first instance from the task queue to be put into operation.
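A minimal sketch of this per-instance quota variant follows. All names and quota values are illustrative assumptions: the container's quota is the sum of its instances' quotas, and when a slot belonging to a given instance frees up, that instance's waiting task is scheduled first:

```python
instance_quota = {"a1": 2, "a2": 1}   # units per instance (illustrative)
instance_used  = {"a1": 2, "a2": 1}   # a1's own quota is currently full

waiting = ["a1"]                       # a1's instance task waits in the queue

def on_quota_freed(instance: str, units: int = 1) -> str:
    """When `instance`'s own quota frees up, schedule that instance's
    waiting task rather than handing the units to another instance."""
    instance_used[instance] -= units
    if instance in waiting and instance_used[instance] < instance_quota[instance]:
        waiting.remove(instance)
        instance_used[instance] += units
        return f"{instance} task scheduled to run"
    return "no waiting task for this instance"

print(on_quota_freed("a1"))  # a1 task scheduled to run
```

This keeps preemption confined not only to the container but to each instance's own quota share inside it.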
In a possible implementation manner, if the task queue further includes an instance task corresponding to a second instance, where the second instance belongs to the target service, the scheduling unit 704 is configured to:
acquiring the priority of the first instance and the priority of the second instance;
If the priority of the first instance is higher than that of the second instance, taking the instance task corresponding to the first instance as the target instance task;
And scheduling the instance task corresponding to the first instance from the task queue to be put into operation.
In one possible implementation manner, after the instance task corresponding to the first instance is scheduled from the task queue and put into operation, the scheduling unit 704 is further configured to:
And when the available computing power resources exist in the target container, scheduling an instance task corresponding to the second instance from the task queue to be put into operation.
In a possible implementation manner, the apparatus further includes a receiving unit and an adjusting unit:
The receiving unit is configured to receive a capacity expansion request, where the capacity expansion request includes a quota expansion capacity, and the quota expansion capacity is determined according to a newly added instance;
The adjusting unit is configured to adjust the computing power resource quota of the target container according to the quota expansion capacity.
In a possible implementation manner, the determining unit 702 is further configured to:
Determining whether the idle resource amount in the resource pool meets the quota expansion capacity;
If yes, triggering the adjusting unit to execute the step of adjusting the computing power resource quota of the target container according to the quota expansion capacity;
the adjusting unit is further configured to adjust the idle resource amount of the resource pool according to the quota expansion capacity.
In a possible implementation, the adjusting unit is further configured to:
If the determining unit 702 determines that the idle resource amount in the resource pool does not meet the quota expansion capacity, expanding the capacity of the resource pool;
and adjusting the computing power resource quota of the target container according to the quota expansion capacity by using the expanded resource pool.
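The quota-expansion flow above can be sketched as follows. This is an illustrative sketch only; the function signature and all numeric values are assumptions, not the claimed implementation:

```python
def expand_quota(container_quota: int, pool_idle: int, pool_total: int,
                 delta: int):
    """Grant a quota expansion of `delta` units to a container.
    Returns updated (container_quota, pool_idle, pool_total)."""
    if pool_idle < delta:
        # The pool's idle amount cannot cover the request:
        # grow the pool by the shortfall first.
        shortfall = delta - pool_idle
        pool_total += shortfall
        pool_idle += shortfall
    # Grant the expansion and deduct it from the pool's idle amount.
    return container_quota + delta, pool_idle - delta, pool_total

# The pool has 3 idle units; the target container requests 5 more.
quota, idle, total = expand_quota(container_quota=10, pool_idle=3,
                                  pool_total=20, delta=5)
print(quota, idle, total)  # 15 0 22
```

The two branches correspond to the two implementations above: deduct from the pool's idle amount when it suffices, otherwise expand the pool and then adjust the container's quota from the expanded pool.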
The embodiment of the application also provides a device for computing power resource allocation, which may be a data processing device configured to execute the computing power resource allocation method described above. The device may be a terminal device; the following takes a smart phone as an example of the terminal device:
Fig. 8 is a block diagram of part of the structure of a smart phone serving as the terminal device provided by an embodiment of the present application. Referring to fig. 8, the smart phone includes a radio frequency (RF) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a wireless fidelity (WiFi) module 870, a processor 880, and a power supply 890. The input unit 830 may include a touch panel 831 and other input devices 832, the display unit 840 may include a display panel 841, and the audio circuit 860 may include a speaker 861 and a microphone 862. Those skilled in the art will appreciate that the structure shown in fig. 8 does not constitute a limitation on the smart phone, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory 820 may be used to store software programs and modules, and the processor 880 executes various functional applications and data processing of the smart phone by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like, and the data storage area may store data created according to the use of the smart phone (such as audio data, a phonebook, etc.), and the like. In addition, the memory 820 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 880 is the control center of the smart phone; it connects various parts of the entire smart phone through various interfaces and lines, and performs various functions of the smart phone and processes data by running or executing the software programs and/or modules stored in the memory 820 and calling the data stored in the memory 820. Optionally, the processor 880 may include one or more processing units; preferably, the processor 880 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may alternatively not be integrated into the processor 880.
In this embodiment, the processor 880 in the terminal device 800 may perform the following steps:
converting the acquired first instance into an instance task, wherein the first instance belongs to a target service;
determining a target container corresponding to the target service according to the corresponding relation between the service and the container;
if it is determined, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, controlling the instance task corresponding to the first instance to enter a task queue of the target container;
And when the available computing power resources exist in the target container, scheduling the target instance task from the task queue to be put into operation.
The device may further include a server. As shown in fig. 9, fig. 9 is a block diagram of a server 900 provided by the embodiment of the present application. The server 900 may vary considerably by configuration or performance, and may include one or more central processing units (Central Processing Units, abbreviated as CPU) 922 (e.g., one or more processors), a memory 932, and one or more storage media 930 (e.g., one or more mass storage devices) storing application programs 942 or data 944. The memory 932 and the storage medium 930 may be transient or persistent storage. The program stored in the storage medium 930 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Further, the central processing unit 922 may be configured to communicate with the storage medium 930 and execute, on the server 900, the series of instruction operations in the storage medium 930.
The server 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 958, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
In this embodiment, the central processing unit 922 in the server 900 may perform the following steps:
converting the acquired first instance into an instance task, wherein the first instance belongs to a target service;
determining a target container corresponding to the target service according to the corresponding relation between the service and the container;
if it is determined, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, controlling the instance task corresponding to the first instance to enter a task queue of the target container;
And when the available computing power resources exist in the target container, scheduling the target instance task from the task queue to be put into operation.
According to an aspect of the present application, there is provided a computer readable storage medium for storing program code for executing the computing power resource allocation method according to the foregoing embodiments.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the various alternative implementations of the above embodiments.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present application. The storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
While the application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the foregoing embodiments may be modified or equivalents may be substituted for some of the features thereof, and that the modifications or substitutions do not depart from the spirit and scope of the embodiments of the application.