CN112380020B - A method, device, equipment and storage medium for allocating computing power resources - Google Patents

A method, device, equipment and storage medium for allocating computing power resources
Download PDF

Info

Publication number
CN112380020B
Authority
CN
China
Prior art keywords
instance
computing power
task
target
power resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011395313.3A
Other languages
Chinese (zh)
Other versions
CN112380020A (en)
Inventor
查冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011395313.3A
Publication of CN112380020A
Application granted
Publication of CN112380020B
Legal status: Active (Current)
Anticipated expiration


Abstract


The embodiments of the present application disclose a computing power resource allocation method, apparatus, device, and storage medium. Different services are configured with corresponding containers, and each container is configured with a corresponding computing power resource quota. When a submitted first instance is obtained for a target service, the first instance is converted into an instance task, and the target container corresponding to the target service is determined according to the correspondence between services and containers. If it is determined, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, the instance task corresponding to the first instance is controlled to enter the task queue of the target container. When available computing power resources are detected in the target container, a target instance task is scheduled from the task queue and put into operation. Because different services are isolated by containers, instances of the same service share the computing power resources of the same container; even if priority preemption occurs during scheduling, it is constrained within the container. This avoids computing power resource preemption between different services and improves the user experience.

Description

Method, device, equipment and storage medium for distributing computing power resources
Technical Field
The present application relates to the field of data processing, and in particular, to a method, an apparatus, a device, and a storage medium for computing resource allocation.
Background
With the rapid development of science and technology, various advanced technologies are continuously emerging. Graphics processing units (GPUs) are becoming increasingly popular due to their strong computing power. GPUs are used to perform computational processing in a variety of scenarios, such as artificial intelligence (AI) model training.
In the AI model training processes of different services, computing power resources need to be shared. For services using computing power resources, the current scheme provides a unified task layer: instances of multiple services are placed in a single task-layer queue to wait for scheduling, and the scheduling strategy is essentially to obtain or configure a priority for each service and preempt computing power resources according to that priority.
However, this approach widens the scope of priority-preemption scheduling across different services, so computing power resource preemption occurs between services, which affects the user experience.
Disclosure of Invention
In order to solve the above technical problems, the present application provides a method, an apparatus, a device, and a storage medium for allocating computing power resources, which ensure that the operation of instance tasks belonging to the same service is isolated within the corresponding container. Even if priority preemption occurs during scheduling, it is constrained within the container, avoiding computing power resource preemption between different services that would affect the operation of other services and thus the user experience.
The embodiment of the application discloses the following technical scheme:
In a first aspect, an embodiment of the present application provides a method for allocating computing power resources, where different services configure corresponding containers, and each container configures a corresponding computing power resource quota, where the method includes:
converting the acquired first instance into an instance task, wherein the first instance belongs to a target service;
determining a target container corresponding to the target service according to the corresponding relation between the service and the container;
if it is determined, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, controlling the instance task corresponding to the first instance to enter a task queue of the target container;
and when available computing power resources appear in the target container, scheduling a target instance task from the task queue and putting it into operation.
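The four steps of this first aspect can be sketched in a few lines of code. This is a minimal illustration only; all names (`Container`, `Allocator`, the per-task `cost` field measured in quota units) are assumptions of this sketch, not terminology from the patent.

```python
from collections import deque

class Container:
    """A per-service container with a computing power resource quota."""
    def __init__(self, quota):
        self.quota = quota          # quota configured for this container
        self.used = 0               # currently occupied computing power
        self.task_queue = deque()   # instance tasks waiting for resources

class Allocator:
    def __init__(self, service_to_container):
        self.service_to_container = service_to_container  # service -> Container

    def submit(self, instance, service, cost):
        task = {"instance": instance, "cost": cost}       # step 1: instance -> task
        container = self.service_to_container[service]    # step 2: find target container
        if container.used + cost > container.quota:       # step 3: no available resources
            container.task_queue.append(task)
            return "queued"
        container.used += cost                            # quota available: run now
        return "running"

    def on_resources_freed(self, service, freed):
        container = self.service_to_container[service]
        container.used -= freed
        started = []
        # step 4: once resources appear, schedule waiting tasks into operation
        while container.task_queue and \
              container.used + container.task_queue[0]["cost"] <= container.quota:
            task = container.task_queue.popleft()
            container.used += task["cost"]
            started.append(task["instance"])
        return started
```

Because each service is routed to its own `Container`, a full queue in one container never blocks or preempts instances of another service, which is the isolation property the claim describes.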
In a second aspect, an embodiment of the present application provides a computing power resource allocation device, where different service configurations correspond to containers, and each container configures a corresponding computing power resource quota, where the device includes a conversion unit, a determination unit, an entry unit, and a scheduling unit:
The conversion unit is used for converting the acquired first instance into an instance task, wherein the first instance belongs to a target service;
The determining unit is used for determining a target container corresponding to the target service according to the corresponding relation between the service and the container;
The entering unit is configured to control an instance task corresponding to the first instance to enter a task queue of the target container if it is determined that no available computing power resource exists in the target container according to the computing power resource limit corresponding to the target container;
And the scheduling unit is used for scheduling the task of the target instance to be put into operation from the task queue when the available computing power resource exists in the target container.
In a third aspect, an embodiment of the present application provides a device for computing power resource allocation, the device comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
The processor is configured to perform the method of the first aspect according to instructions in the program code.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing program code for performing the method of the first aspect.
According to the above technical scheme, corresponding containers are configured for different services, so that the services are isolated from one another by containers and the instance tasks of each service run in the corresponding container without disturbing other tasks. The containers have the capability of controlling computing power resources: each container is configured with a corresponding computing power resource quota, which ensures the running quality of multi-instance tasks. In this way, when a submitted instance is obtained for the target service, taking the first instance as an example, the first instance can be converted into an instance task, and the target container corresponding to the target service is determined according to the correspondence between services and containers. If it is determined, according to the computing power resource quota corresponding to the target container, that no available computing power resources exist in the target container, the instance task corresponding to the first instance is controlled to enter the task queue of the target container, so that the first instance is later run with the computing power resources configured for the target container and does not preempt the computing power resources of other services.
When available computing power resources appear in the target container, the target instance task is scheduled from the task queue and put into operation. That is, instance tasks belonging to the same service enter the task queue of the corresponding container to wait for scheduling, and once available computing power resources appear in the container, the target instance task is scheduled from the task queue into operation. This ensures that the operation of instance tasks belonging to the same service is isolated within the corresponding container; even if priority preemption occurs during scheduling, it is constrained within the container, avoiding computing power resource preemption between different services that would affect the operation of other services and thus the user experience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions of the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings described below are only some embodiments of the present application, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a system architecture of a computing power resource allocation method according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for allocating computing power resources according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a system architecture for computing power resource allocation based on namespaces according to an embodiment of the present application;
FIG. 4 is a flow chart of priority preemption in a single namespace in accordance with an embodiment of the present application;
FIG. 5 is a flowchart of a method for expanding a target container according to an embodiment of the present application;
FIG. 6 is a flowchart of a method for allocating computing power resources according to an embodiment of the present application;
FIG. 7 is a block diagram of a computing power resource allocation device according to an embodiment of the present application;
Fig. 8 is a block diagram of a terminal device according to an embodiment of the present application;
Fig. 9 is a block diagram of a server according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
For services using computing power resources, the current scheme provides a unified task layer: instances of multiple services are placed in a single task-layer queue to wait for scheduling, and the scheduling strategy is essentially to obtain or configure a priority for each service and preempt computing power resources according to that priority.
Taking services A and B as an example, service A has instances a1, a2, a3, etc., and service B has instances b1, b2, b3, etc. Suppose the instance tasks corresponding to the instances of services A and B wait for scheduling in the same task-layer queue. When available computing power resources appear within service A's quota, if the priority of service B is higher than that of service A, the available computing power resources corresponding to service A can be preempted by an instance of service B. This results in service A having a quota but no resources, so service A cannot run, which affects the user experience.
In order to solve the above technical problems, the embodiment of the present application configures a corresponding container (such as a namespace, a file system, or a resource view) for each service to isolate services. When an instance is submitted, resource preemption is performed only according to whether available resources exist in the container corresponding to the service to which the instance belongs, so that resource preemption is constrained within a queue of service granularity and resource preemption between different services is avoided.
It should be noted that, in the method for allocating computing power resources provided by the embodiment of the present application, services are isolated from one another by containers: when an instance is submitted, computing power resource preemption is performed only according to whether available computing power resources exist in the container corresponding to the service to which the instance belongs, so that computing power resource preemption is constrained within a queue of service granularity and resource preemption between different services is avoided.
The method provided by the embodiment of the present application relates to the field of cloud technology. Cloud computing refers to a delivery and use mode of internet technology (IT) infrastructure, that is, obtaining required resources through a network in an on-demand, easily scalable manner; cloud computing in the broad sense refers to a delivery and use mode of services, that is, obtaining required services through a network in an on-demand, easily scalable manner. Such services may be IT, software, internet-related, or other services. Cloud computing is a product of the fusion of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage, virtualization, and load balancing.
The method provided by the embodiment of the present application also relates to the field of artificial intelligence (AI). Artificial intelligence is an integrated technology of computer science that aims to understand the essence of intelligence and produce new intelligent machines capable of reacting in a manner similar to human intelligence; that is, it studies the design principles and implementation methods of various intelligent machines so that the machines have the functions of sensing, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive subject covering a wide range of fields, involving both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, automatic driving, and other directions.
With the development of artificial intelligence, AI training needs to be run in more and more scenarios to train AI models, and computing power resources, such as GPU computing power resources, need to be delivered when AI training runs. With the method provided by the embodiment of the present application, the AI training of the same service is placed in the same container, realizing isolation between different services, so that after an AI training instance is submitted to run, resource preemption between different services is avoided.
The method provided by the embodiment of the application can be applied to data processing equipment, and the data processing equipment can be a server or terminal equipment. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing service. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
Next, the system architecture of the computing power resource allocation method will be described taking the case where the data processing apparatus is a server as an example. Referring to fig. 1, fig. 1 is a schematic system architecture diagram of a computing power resource allocation method according to an embodiment of the present application. The system architecture comprises a server 101, on which a computing power platform can be deployed to provide services for various businesses.
The computing power platform needs to deliver computing power resources, such as GPU computing power resources, when a served business runs AI training. To ensure isolation between services, corresponding containers are configured for different services, and each container is configured with a corresponding computing power resource quota.
A container is a virtualization technology in a computer operating system that enables a process to run in a relatively independent and isolated environment, which simplifies the software deployment flow, enhances the portability and security of software, and improves the utilization of computing power resources. Container technology is widely used in server scenarios in the field of cloud computing. In this embodiment, the container may be an independent file system, a namespace (Namespace), or a resource view; the present application is mainly described using a namespace as the container.
A container can have the capability of managing and controlling the computing power resources of a service: within a single container, a quota is set, that is, a corresponding computing power resource quota is configured for the container to guarantee multi-instance operation quality; this is referred to as the container quota. If the container is a namespace, it is called a namespace quota. The data processing device, for example a server, may be provided with GPU cards, central processing units (CPUs), memory, disk space, and the like, and the namespaces may be obtained by dividing the GPU cards, CPUs, memory, and disk space.
In AI training operation, a service is further divided into a plurality of instances, and these instances share the computing power resources in the container corresponding to the service. When a certain instance of a target service, such as a first instance, is submitted to run, the server 101 can convert the acquired first instance into an instance task. The server 101 determines the target container corresponding to the target service according to the correspondence between services and containers, so as to determine, according to the computing power resource quota corresponding to the target container, whether available computing power resources exist in the target container (i.e., whether the quota is full). If available computing power resources exist in the target container, the instance task corresponding to the first instance is executed. If it is determined that no available computing power resources exist in the target container, the server 101 controls the instance task corresponding to the first instance to enter the task queue of the target container to wait for running. When it is monitored that available computing power resources appear in the target container, the server 101 schedules the target instance task from the task queue for execution.
The target instance task may be any instance task in the task queue; it may be determined based on priority, or according to a first-in first-out principle, and so on. In some cases, the target instance task may be the instance task corresponding to the first instance.
Next, taking a container as an example of a namespace, a detailed description will be given of a computing resource allocation method provided by an embodiment of the present application with reference to the accompanying drawings.
Referring to fig. 2, fig. 2 shows a flow chart of a method of computing power resource allocation, the method comprising:
S201, the data processing equipment converts the acquired first instance into an instance task, wherein the first instance belongs to a target service.
In this embodiment, corresponding containers may be configured for different services, where each container is configured with a corresponding computing power resource quota. In some cases, the computing power resource amount may also be configured for the container according to the instances in the container, that is, each instance is configured with a corresponding computing power resource amount, where the computing power resource amount of the container is the sum of computing power resource amounts of the instances in the container.
The computing power resource amount corresponding to each container and/or the computing power resource amount of the instance may be recorded in a Database (DB), as shown in fig. 3. With the AI training, the computing power resources may be gradually occupied, and the occupied computing power resources or available computing power resources may be recorded in the database at any time, so that whether the computing power resource amount is full or not can be conveniently confirmed later, that is, whether available computing power resources exist in the container or not. Of course, it may be recorded on a data processing apparatus that performs the computing power resource allocation method, which is not limited in this embodiment.
If a data processing device is provisioned with a number of GPU cards and corresponding CPU, memory, and disk space, then when dividing containers (for example, namespaces), the GPU cards may be divided equally, and the CPU, memory, and disk space corresponding to each GPU card obtained by dividing equally according to the number of GPU cards. For example, if one GPU device has an 8-card GPU, a 96-core CPU, 512 GB of memory, and 4 TB of disk space, the device is divided into 8 parts by GPU card, and each part has 12 CPU cores, 64 GB of memory, and 500 GB of disk space.
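The equal-division arithmetic above can be written out directly; a sketch only, with the function name being an assumption of this illustration:

```python
def partition_gpu_device(gpu_cards, cpu_cores, mem_gb, disk_gb):
    """Divide a GPU device evenly by GPU card: each share gets one card
    plus an equal fraction of the CPU cores, memory, and disk space."""
    return {
        "cpu_cores": cpu_cores // gpu_cards,
        "mem_gb": mem_gb // gpu_cards,
        "disk_gb": disk_gb // gpu_cards,
    }

# The example from the text: an 8-card GPU with a 96-core CPU,
# 512 GB of memory, and 4 TB (4000 GB) of disk space
share = partition_gpu_device(8, 96, 512, 4000)
```

Each of the 8 shares works out to 12 cores, 64 GB of memory, and 500 GB of disk, matching the figures in the text.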
Referring to fig. 3, taking the case where the data processing device is a GPU device and the container is a namespace, suppose the namespaces corresponding to services include namespaces A-D, and each namespace has a namespace quota. The configured computing power resource quota of a namespace may be expressed as a number of cards: in fig. 3, the quota of namespace A is M cards, the quota of namespace B is N cards, the quota of namespace C is K cards, and the quota of namespace D is L cards. The instances of the corresponding service run in each namespace, and each instance is configured with a computing power resource quota. For example, the instances running in namespace A include instances A1, A2, ..., where the quota of instance A1 is a1 cards and the quota of instance A2 is a2 cards; the instances running in namespace B include instances B1, B2, ..., where the quota of instance B1 is b1 cards; the instances running in namespace C include instances C1, C2, ..., where the quota of instance C1 is c1 cards and the quota of instance C2 is c2 cards; and the instances running in namespace D include instances D1, D2, ..., where the quota of instance D1 is d1 cards and the quota of instance D2 is d2 cards.
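The quota layout of fig. 3 can be sketched as a plain mapping. The card counts below are illustrative stand-ins for the symbolic quotas (a1, a2, b1, ...) in the figure, not values from the patent:

```python
# namespace -> {instance -> quota in cards}; numbers are placeholders
namespaces = {
    "A": {"A1": 1, "A2": 1},
    "B": {"B1": 2},
    "C": {"C1": 1, "C2": 2},
    "D": {"D1": 1, "D2": 1},
}

def namespace_quota(instance_quotas):
    """When quotas are configured per instance, a namespace's quota is
    the sum of the quotas of the instances it contains."""
    return sum(instance_quotas.values())
```

With the placeholder numbers above, namespace A's quota is 2 cards and namespace C's is 3, consistent with the rule (stated with S201) that a container's quota is the sum of its instances' quotas.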
When a first instance of the target service is submitted to run, the data processing apparatus may convert the first instance into an instance task (task) for use in scheduling, running. For example, as shown in fig. 4, if the first instance is denoted by A1, after the first instance is acquired, the first instance may be converted into an instance task, and the instance task corresponding to the first instance may be denoted by an instance task A1.
S202, the data processing equipment determines a target container corresponding to the target service according to the corresponding relation between the service and the container.
S203, if the data processing equipment determines that no available computing power resource exists in the target container according to the computing power resource limit corresponding to the target container, controlling the instance task corresponding to the first instance to enter a task queue of the target container.
Since different services are configured with corresponding containers, the data processing apparatus may determine the corresponding target container according to the target service to which the first instance belongs, so as to know whether the container's computing power resource quota can satisfy the operation of the service instance. In one possible implementation, the first instance acquired by the data processing device carries a corresponding service identifier, so that the target container corresponding to the target service is determined according to the service identifier and the correspondence.
The computing power resource quota corresponding to the target container can be read from the database by the data processing device, so that whether available computing power resources exist in the target container can be determined; the occupied computing power resource amount can likewise be read from the database to determine whether the computing power resource quota corresponding to the target container is full.
If the data processing apparatus determines that available computing power resources exist in the target container, the instance task corresponding to the first instance is executed; for example, as shown in fig. 4, the instance task A1 corresponding to the first instance is input to a scheduler to run (task-running). If the data processing equipment determines that no available computing power resources exist in the target container, the instance task corresponding to the first instance is controlled to enter the task queue of the target container to wait.
Fig. 4 also includes other instances submitted for running, such as instance A2, instance A3, and instance A4, whose corresponding instance tasks are instance task A2, instance task A3, and instance task A4. According to steps S201-S203, it can be determined whether instance task A2, instance task A3, and instance task A4 run or enter the task queue. In fig. 4, instance task A1 runs directly while instance tasks A2, A3, and A4 enter the task queue to wait; when instance A1 has a new task to run and it is determined that the computing power resource quota is full again, the new instance task corresponding to instance A1 also enters the task queue to wait.
In some cases, the computing power resource quota of each container is configured according to its instances, and the quota of the target container is the sum of the quotas of the instances in the target container. Even if it is determined that available computing power resources exist in the target container, since each instance is configured with its own quota, the available computing power resources are not necessarily resources that can run the first instance. In this case, the data processing apparatus may further determine whether the quota of the first instance is full, and if so, the step of S203 is performed.
S204, when the data processing equipment monitors that available computing power resources exist in the target container, scheduling a target instance task from the task queue to be put into operation.
The data processing device can periodically monitor whether available computing power resources appear in the target container; when available computing power resources appear, that is, when the computing power resource quota of the target container has free capacity, a target instance task is selected from the task queue and put into operation. The task queue is polled periodically, for example by a scheduler, and when available computing power resources appear, the scheduler selects the target instance task to run. For example, as shown in fig. 4, if instance task Ax is determined to be the target instance task according to priority, the scheduler preferentially selects instance task Ax and puts it into operation.
It should be noted that the waiting instance tasks in the task queue may include a plurality of instance tasks, that is, instance tasks other than the one corresponding to the first instance. The scheduler may select which instance task to put into operation from the task queue according to different principles, such as a priority preemption principle or a first-in first-out principle (the instance task that entered the task queue first is scheduled into operation first). In one possible implementation, suppose the task queue includes the instance task corresponding to the first instance and the instance task corresponding to a second instance, where the second instance also belongs to the target service. An implementation of S204 may then be to obtain the priority of the first instance and the priority of the second instance, and if it is determined that the priority of the first instance is higher than that of the second instance, schedule the instance task corresponding to the first instance as the target instance task from the task queue into operation. That is, the instance task corresponding to the first instance, having higher priority, can preempt the computing power resources, and this preemption occurs within the target container without affecting other services.
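The two selection principles named above (priority preemption, with first-in first-out as the ordering fallback) can be combined in one small selection routine; a sketch under the assumption that tasks carry a numeric `priority` field, which is not terminology from the patent:

```python
def pick_target_task(task_queue):
    """Pop the waiting task with the highest priority; on a priority tie,
    fall back to first-in-first-out order (earliest-queued task wins)."""
    best = min(range(len(task_queue)),
               key=lambda i: (-task_queue[i]["priority"], i))
    return task_queue.pop(best)
```

Because the routine only ever looks at one container's queue, the preemption it performs stays inside that container, matching the isolation argument above.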
In one possible implementation, the computing power resource quota of each container is configured per instance, that is, each instance is configured with a corresponding computing power resource quota. If the data processing device detects that an available computing power resource appears in the target container, it can further identify which instance the available computing power resource corresponds to. If it is determined that the available computing power resource comes from the computing power resource quota of the first instance, that is, the quota of the first instance has become free, the data processing device takes the instance task corresponding to the first instance as the target instance task and schedules it from the task queue into operation.
However, the instance tasks in the task queue may have different priorities. The higher the priority, the more important the corresponding instance task may be considered and the more it needs to run first, so as to better meet user requirements. Therefore, even if the available computing power resource belongs to the quota of a certain instance, indicating that that instance's quota has become free, the task queue may still contain other instance tasks, so computing power resource preemption within the target container is performed according to the priorities of the instance tasks, and the instance task with the higher priority runs first.
Taking as an example a task queue that includes the instance task corresponding to the first instance and the instance task corresponding to the second instance: even if the available computing power resource comes from the computing power resource quota of the second instance, that is, the quota of the second instance has become free, the data processing device may still obtain the priority of the first instance and the priority of the second instance. If it is determined that the priority of the first instance is higher than that of the second instance, the instance task corresponding to the first instance is taken as the target instance task and scheduled from the task queue into operation; that is, the first instance preempts the computing power resource of the second instance.
After the computing power resource of the second instance has been preempted by the first instance, the data processing device continues to monitor whether available computing power resources appear in the target container, and when available computing power resources appear again, it schedules the instance task corresponding to the second instance from the task queue into operation. Running the preempted instance's task first when resources become available again avoids the situation where an instance can never run because higher-priority instances keep occupying the computing power resources, that is, it avoids instance starvation.
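The anti-starvation behavior just described can be sketched as a queue that remembers which instance was preempted and runs it at the next opportunity. The class name `AntiStarvationQueue` and the method `record_preemption` are illustrative assumptions for this sketch.

```python
from collections import deque

class AntiStarvationQueue:
    """Waiting queue that remembers a preempted instance and runs it next."""

    def __init__(self):
        self.waiting = deque()     # ordinary waiting instance tasks
        self.preempted = deque()   # tasks whose freed quota was taken by others

    def enqueue(self, task):
        self.waiting.append(task)

    def record_preemption(self, task):
        """Called when `task` owned the freed quota but a higher-priority task took it."""
        self.preempted.append(task)

    def schedule_next(self):
        # A previously preempted task runs first, so it cannot starve forever.
        if self.preempted:
            return self.preempted.popleft()
        return self.waiting.popleft() if self.waiting else None

q = AntiStarvationQueue()
q.enqueue("task_b_high_priority")
q.record_preemption("task_second_instance")  # its quota was just taken by the first instance
next_task = q.schedule_next()
```

On the next appearance of available computing power resources, the preempted second instance runs before any newly arrived tasks, matching the starvation-avoidance behavior described above.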
According to the above technical solution, corresponding containers are configured for different services, so that the services are isolated from one another through the containers, and the instance tasks of each service run in the corresponding container without disturbing other tasks. Each container has the capability of managing computing power resources, that is, each container is configured with a corresponding computing power resource quota, which guarantees the running quality of multi-instance tasks. In this way, when an instance submitted for the target service is obtained, taking the first instance as an example, the obtained first instance can be converted into an instance task, and the target container corresponding to the target service is determined according to the correspondence between services and containers. If it is determined, according to the computing power resource quota corresponding to the target container, that no computing power resources are available in the target container, the instance task corresponding to the first instance is controlled to enter the task queue of the target container, so that the first instance runs on the computing power resources configured for the target container and does not preempt the computing power resources of other services.
When available computing power resources appear in the target container, the target instance task is scheduled from the task queue into operation. In other words, instance tasks belonging to the same service enter the task queue of the corresponding container to wait for scheduling, and once available computing power resources appear in the container, the target instance task is scheduled from the task queue into operation. This ensures that the running of instance tasks of the same service is isolated within the corresponding container: even if priority preemption occurs during scheduling, it is confined to that container, which avoids computing power resource preemption between different services, avoids affecting the running of other services, and thus avoids degrading the user experience.
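The submit-or-enqueue decision summarized above can be sketched as a per-service container that tracks its quota. The class `Container` and the per-task `demand` parameter are illustrative assumptions; the embodiment does not prescribe how demand is expressed.

```python
class Container:
    """Per-service container with a fixed computing power quota (sketch)."""

    def __init__(self, service, quota):
        self.service = service
        self.quota = quota        # total computing power quota of this container
        self.in_use = 0           # computing power currently consumed
        self.task_queue = []      # waiting instance tasks of this service only
        self.running = []

    def submit(self, task, demand):
        """Run the instance task if quota is free, otherwise enqueue it here."""
        if self.in_use + demand <= self.quota:
            self.in_use += demand
            self.running.append(task)
            return "running"
        self.task_queue.append(task)   # waits inside this container, never elsewhere
        return "queued"

# One container per service; submissions never touch another service's quota.
containers = {"service_a": Container("service_a", quota=4)}
state1 = containers["service_a"].submit("first_instance_task", demand=3)
state2 = containers["service_a"].submit("another_task", demand=2)  # exceeds free quota
```

The second submission exceeds the remaining quota and therefore waits in the container's own task queue, illustrating why no cross-service preemption can occur.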
In addition, multi-tenant isolation of services is realized through the containers, and task scheduling is organized at the granularity of services, which facilitates the operation of the computing platform.
As the number of instances of a service in a container increases, the computing power resource quota of the container may no longer meet the service requirements, so the container can be expanded. Taking the target container as an example, the data processing device may receive an expansion application that includes a quota expansion capacity determined according to the newly added instances, and the data processing device adjusts the computing power resource quota of the target container according to that quota expansion capacity. The expansion application may be triggered by an operator of the computing platform.
When applying to expand the target container, the additional computing power resources are obtained from a resource pool. However, the computing power resources in the resource pool are also limited, so the computing power resource quota of the target container cannot be expanded without bound. A flowchart of a method for expanding the target container is shown in fig. 5, and the method includes:
S501, the data processing device receives a capacity expansion application.
S502, the data processing device determines whether the amount of idle resources in the resource pool satisfies the quota expansion capacity. If not, S503 is executed; if yes, S504 is executed.
S503, the data processing device expands the capacity of the resource pool.
If it is determined that the amount of idle resources in the resource pool does not satisfy the quota expansion capacity, the resource pool may be expanded first, and then the expanded resource pool is used to adjust the computing power resource quota of the target container according to the quota expansion capacity, that is, S504 is executed to expand the computing power resource quota of the target container.
S504, the data processing device adjusts the computing power resource quota of the target container according to the quota expansion capacity.
S505, the data processing device adjusts the amount of idle resources in the resource pool according to the quota expansion capacity.
After the computing power resource quota of the target container is adjusted, the amount of idle resources in the resource pool decreases. The amount of idle resources in the resource pool is therefore updated according to the quota expansion capacity, so that S502 can be executed with an accurate idle resource amount the next time a quota expansion is performed.
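The S501 to S505 flow can be sketched as a single function over the pool's idle amount and the container quota. The `pool_step` parameter, standing in for however much S503 grows the pool, is a hypothetical assumption of this sketch.

```python
def expand_container_quota(pool_idle, container_quota, quota_expansion, pool_step=8):
    """Sketch of the fig. 5 flow: expand the pool first if its idle amount cannot
    cover the requested quota expansion, then raise the container quota and
    deduct the granted amount from the pool's idle resources."""
    if pool_idle < quota_expansion:        # S502: idle amount insufficient
        pool_idle += pool_step             # S503: expand the resource pool (assumed step)
    container_quota += quota_expansion     # S504: raise the target container's quota
    pool_idle -= quota_expansion           # S505: update the pool's idle amount
    return pool_idle, container_quota

# Pool has only 2 idle units but 5 are requested, so the pool grows first.
idle, quota = expand_container_quota(pool_idle=2, container_quota=10, quota_expansion=5)
```

Keeping `pool_idle` accurate after every grant is what allows the S502 check to be trusted on the next expansion application.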
In the embodiment of the application, services are isolated through containers. When the instances in a service's container increase, an operator of the computing platform can evaluate the increment of computing power resources for that container (that is, the quota expansion capacity), which solves the problem that a conventional unified task layer can hardly evaluate the quota expansion capacity.
Next, the computing power resource allocation method provided by the embodiment of the application is described in connection with an actual application scenario. In a scenario where services run AI training and require GPU computing power resources, namespaces can be configured for the services to isolate them from one another and avoid computing power resource preemption between services: different namespaces share the physical GPU computing power resources but do not share the logical GPU computing power quotas of the namespaces. In AI training, one service is divided into a plurality of instances, and on the premise that these instances share the GPU computing power resources of the same namespace, a computing power resource quota is configured per instance. Based on this, taking the data processing device being a server as an example, when a certain instance, such as the first instance, is submitted to run, computing power resource allocation may be implemented based on the method shown in fig. 6, which includes:
S601, the server converts the acquired first instance into an instance task.
S602, if the server determines that available computing power resources exist in the target name space, the instance task is operated.
Because the first instance belongs to the target service, and in this embodiment services are isolated by configuring a namespace per service, each service has a correspondence with a namespace. The first instance acquired by the server may carry a service identifier, so the target namespace corresponding to the target service is determined according to the service identifier and the correspondence.
If the server determines that the computing power resource quota of the target namespace is not fully occupied, that is, available computing power resources exist, the instance task corresponding to the first instance can be run directly.
S603, if the server determines that no available computing power resource exists in the target naming space, controlling an instance task corresponding to the first instance to enter a task queue of the target naming space.
If the server determines that the computing power resource quota of the target namespace is fully occupied, that is, no available computing power resources exist, the instance task corresponding to the first instance is controlled to enter the task queue to wait.
S604, the server polls to monitor whether available computing power resources appear in the target naming space.
S605, when the server detects that available computing power resources appear in the target namespace, it schedules the target instance task from the task queue into operation.
The task queue may contain a plurality of waiting instance tasks, that is, it may include other instance tasks in addition to the instance task corresponding to the first instance. These other instance tasks may have entered the task queue before or after the instance task corresponding to the first instance, which is not limited in this embodiment.
The server may poll periodically to monitor whether available computing power resources appear in the target namespace, and if so, schedule the instance tasks in the task queue based on a priority preemption principle, thereby confining priority preemption to the target namespace.
For example, the task queue includes the instance task corresponding to the first instance and the instance task corresponding to a second instance, where the second instance also belongs to the target service. If the server determines that the priority of the first instance is higher than that of the second instance, the instance task corresponding to the first instance preempts the available computing power resources and is scheduled from the task queue into operation.
Of course, the instance tasks in the task queue may also be scheduled based on a first-in first-out principle. For example, the task queue includes the instance task corresponding to the first instance and the instance task corresponding to the second instance, where the second instance also belongs to the target service and its instance task entered the task queue after the instance task corresponding to the first instance. If the server detects that available computing power resources appear in the target namespace, then even if the available computing power resources come from the computing power resource quota of the second instance, that is, the quota of the second instance has become free, the instance task corresponding to the first instance is scheduled into operation first because it entered the task queue first, thereby realizing resource preemption within the target namespace.
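The first-in first-out principle above can be sketched in a few lines: the queue order alone decides which task runs, regardless of which instance's quota became free. The task names here are illustrative assumptions.

```python
from collections import deque

# First-in first-out scheduling inside one namespace: the task that entered
# the queue first runs first, even if the freed quota belonged to a later instance.
task_queue = deque()
task_queue.append("first_instance_task")    # entered the queue first
task_queue.append("second_instance_task")   # its own quota later becomes free

freed_quota_owner = "second_instance_task"
scheduled = task_queue.popleft()            # FIFO ignores which instance freed the quota
```

Here the first instance's task is scheduled onto quota freed by the second instance, which is exactly the in-namespace preemption the paragraph describes.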
In the embodiment of the application, corresponding namespaces are configured for different services, so that the services are isolated from one another through the namespaces, and the instance tasks of each service run in the corresponding namespace without interfering with other tasks. Each namespace has the capability of managing computing power resources, that is, a corresponding computing power resource quota is configured for each namespace, which guarantees the running quality of multi-instance tasks. Thus, when the first instance submitted for the target service is obtained, the server can convert it into an instance task and determine the target namespace corresponding to the target service according to the correspondence between services and namespaces. If it is determined that no computing power resources are available in the target namespace, the instance task corresponding to the first instance is controlled to enter the task queue of the target namespace, so that the first instance runs on the computing power resources configured for the target namespace and does not preempt the computing power resources of other services. When available computing power resources appear in the target namespace, the target instance task is scheduled from the task queue into operation, which ensures that the running of instance tasks of the same service is isolated within the corresponding namespace; even if priority preemption occurs during scheduling, it is confined to that namespace, which avoids computing power resource preemption between different services, avoids affecting the running of other services, and thus avoids degrading the user experience.
Based on the computing power resource allocation method provided in the embodiment corresponding to fig. 2, the embodiment of the present application further provides a computing power resource allocation apparatus. Containers are configured for different services, and each container is configured with a corresponding computing power resource quota. Referring to fig. 7, the apparatus 700 includes a conversion unit 701, a determination unit 702, an entry unit 703, and a scheduling unit 704:
The conversion unit 701 is configured to convert the obtained first instance into an instance task, where the first instance belongs to a target service;
The determining unit 702 is configured to determine, according to a correspondence between a service and a container, a target container corresponding to the target service;
The entering unit 703 is configured to control the instance task corresponding to the first instance to enter the task queue of the target container if it is determined, according to the computing power resource quota corresponding to the target container, that no computing power resources are available in the target container;
The scheduling unit 704 is configured to schedule, when it is detected that available computing power resources appear in the target container, a target instance task from the task queue into operation.
In a possible implementation manner, if the task queue further includes an instance task corresponding to a second instance, where the second instance belongs to the target service, the scheduling unit 704 is configured to:
acquiring the priority of the first instance and the priority of the second instance;
If the priority of the first instance is higher than that of the second instance, taking the instance task corresponding to the first instance as the target instance task;
And scheduling the instance task corresponding to the first instance from the task queue to be put into operation.
In a possible implementation, the computing power resource quota of each container is configured per instance, the computing power resource quota of the target container being the sum of the computing power resource quotas of the instances in the target container, and the determining unit 702 is further configured to:
determine, if available computing power resources exist in the target container according to the computing power resource quota corresponding to the target container, whether the computing power resource quota of the first instance is fully occupied;
and if the determining unit determines that the computing power resource quota of the first instance is fully occupied, trigger the entering unit to execute the step of controlling the instance task corresponding to the first instance to enter the task queue of the target container.
In a possible implementation manner, the scheduling unit 704 is configured to:
If the available computing power resource is determined according to the computing power resource limit of the first instance, taking an instance task corresponding to the first instance as the target instance task;
And scheduling the instance task corresponding to the first instance from the task queue to be put into operation.
In a possible implementation manner, if the task queue further includes an instance task corresponding to a second instance, where the second instance belongs to the target service, the scheduling unit 704 is configured to:
acquiring the priority of the first instance and the priority of the second instance;
If the priority of the first instance is higher than that of the second instance, taking the instance task corresponding to the first instance as the target instance task;
And scheduling the instance task corresponding to the first instance from the task queue to be put into operation.
In one possible implementation manner, after the instance task corresponding to the first instance is scheduled to be put into operation from the task queue, the scheduling unit 704 is further configured to:
And when the available computing power resources exist in the target container, scheduling an instance task corresponding to the second instance from the task queue to be put into operation.
In a possible implementation manner, the apparatus further includes a receiving unit and an adjusting unit:
The receiving unit is configured to receive a capacity expansion application, wherein the capacity expansion application includes a quota expansion capacity determined according to the newly added instances;
The adjusting unit is used for adjusting the computing power resource limit of the target container according to the quota expansion capacity.
In a possible implementation manner, the determining unit 702 is further configured to:
determine whether the amount of idle resources in the resource pool satisfies the quota expansion capacity;
If yes, triggering the adjusting unit to execute the step of adjusting the computing power resource quota of the target container according to the quota expansion capacity;
the adjusting unit is further configured to adjust the amount of idle resources in the resource pool according to the quota expansion capacity.
In a possible implementation, the adjusting unit is further configured to:
if the determining unit 702 determines that the amount of idle resources in the resource pool does not satisfy the quota expansion capacity, expand the capacity of the resource pool;
and adjusting the computing power resource limit of the target container according to the quota expansion capacity by using the expanded resource pool.
The embodiment of the application further provides a device for computing power resource allocation. The device may be the data processing device configured to execute the above computing power resource allocation method, and may be a terminal device. The following description takes the terminal device being a smart phone as an example:
Fig. 8 is a block diagram showing a part of a structure of a smart phone related to a terminal device provided by an embodiment of the present application. Referring to fig. 8, the smart phone includes a Radio Frequency (RF) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a wireless fidelity (WiFi) module 870, a processor 880, and a power supply 890. The input unit 830 may include a touch panel 831 and other input devices 832, the display unit 840 may include a display panel 841, and the audio circuit 860 may include a speaker 861 and a microphone 862. Those skilled in the art will appreciate that the smartphone structure shown in fig. 8 is not limiting of the smartphone and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
The memory 820 may be used to store software programs and modules, and the processor 880 performs various functional applications and data processing of the smart phone by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a program storage area and a data storage area: the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like, and the data storage area may store data created according to the use of the smart phone (such as audio data, a phonebook, etc.). In addition, the memory 820 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 880 is a control center of the smart phone, connects various parts of the entire smart phone using various interfaces and lines, performs various functions of the smart phone and processes data by running or executing software programs and/or modules stored in the memory 820, and calling data stored in the memory 820. In the alternative, processor 880 may include one or more processing units, and preferably, processor 880 may integrate an application processor that primarily processes operating systems, user interfaces, application programs, and the like, with a modem processor that primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 880.
In this embodiment, the processor 880 in the terminal device 800 may perform the following steps:
converting the acquired first instance into an instance task, wherein the first instance belongs to a target service;
determining a target container corresponding to the target service according to the corresponding relation between the service and the container;
if the available computing power resources in the target container are determined according to the computing power resource limit corresponding to the target container, controlling the instance task corresponding to the first instance to enter a task queue of the target container;
And when the available computing power resources exist in the target container, scheduling the target instance task from the task queue to be put into operation.
The device may also be a server. Fig. 9 is a block diagram of the server 900 provided by the embodiment of the present application. The server 900 may vary considerably in configuration or performance, and may include one or more central processing units (Central Processing Units, abbreviated as CPU) 922 (e.g., one or more processors), a memory 932, and one or more storage media 930 (e.g., one or more mass storage devices) storing application programs 942 or data 944. The memory 932 and the storage medium 930 may be transitory or persistent. The program stored in the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the central processing unit 922 may be configured to communicate with the storage medium 930 and execute, on the server 900, the series of instruction operations in the storage medium 930.
The server 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 958, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
In this embodiment, the CPU 922 in the server 900 may perform the following steps:
converting the acquired first instance into an instance task, wherein the first instance belongs to a target service;
determining a target container corresponding to the target service according to the corresponding relation between the service and the container;
if the available computing power resources in the target container are determined according to the computing power resource limit corresponding to the target container, controlling the instance task corresponding to the first instance to enter a task queue of the target container;
And when the available computing power resources exist in the target container, scheduling the target instance task from the task queue to be put into operation.
According to an aspect of the present application, there is provided a computer readable storage medium for storing program code for executing the computing power resource allocation method according to the foregoing embodiments.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the methods provided in the various alternative implementations of the above embodiments.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the foregoing embodiments may be modified or equivalents may be substituted for some of the features thereof, and that the modifications or substitutions do not depart from the spirit and scope of the embodiments of the application.

Claims (15)

CN202011395313.3A | 2020-12-03 | A method, device, equipment and storage medium for allocating computing power resources | Active | CN112380020B (en)

Priority Applications (1)

Application Number: CN202011395313.3A | Priority Date: 2020-12-03 | Filing Date: 2020-12-03 | Title: A method, device, equipment and storage medium for allocating computing power resources

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011395313.3A | 2020-12-03 | 2020-12-03 | A method, device, equipment and storage medium for allocating computing power resources

Publications (2)

Publication Number | Publication Date
CN112380020A (en) | 2021-02-19
CN112380020B (en) | 2025-06-06

Family

ID=74590300

Family Applications (1)

Application Number | Status | Priority Date | Filing Date | Title
CN202011395313.3A (CN112380020B) | Active | 2020-12-03 | 2020-12-03 | A method, device, equipment and storage medium for allocating computing power resources

Country Status (1)

Country | Link
CN (1) | CN112380020B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN112988390B (en) * | 2021-03-22 | 2024-11-01 | Shanghai Supercomputer Center | Computing power resource allocation method and device
CN113760180B (en) * | 2021-04-22 | 2025-09-09 | Tencent Technology (Shenzhen) Co., Ltd. | Storage resource management method, device, equipment and computer-readable storage medium
CN113590282B (en) * | 2021-07-19 | 2024-12-31 | Haining ESWIN Computing Technology Co., Ltd. | Computing power scheduling method, system, electronic device and computer-readable storage medium
CN113521753B (en) * | 2021-07-21 | 2023-08-15 | MIGU Interactive Entertainment Co., Ltd. | System resource adjustment method, device, server and storage medium
CN114490056A (en) * | 2022-01-24 | 2022-05-13 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Data processing method, apparatus, device and storage medium
CN114648870B (en) * | 2022-02-11 | 2023-07-28 | Xingyun Xinneng Technology (Shenzhen) Co., Ltd. | Edge computing system, edge computing decision prediction method, and computer-readable storage medium
CN115017030A (en) * | 2022-02-22 | 2022-09-06 | Zeku Technology (Beijing) Co., Ltd. | Resource allocation method, device, electronic equipment and storage medium
CN114613193A (en) * | 2022-03-22 | 2022-06-10 | Chongqing Changan Automobile Co., Ltd. | Computing power sharing-based parking space acquisition method, storage medium, system and vehicle
CN114866430B (en) * | 2022-03-29 | 2025-01-10 | Beijing Smartchip Microelectronics Technology Co., Ltd. | Edge computing power prediction method, computing power scheduling method and system
CN114675976B (en) * | 2022-05-26 | 2022-09-16 | Shenzhen Qianhai Huanrong Lianyi Information Technology Service Co., Ltd. | Kubernetes-based GPU sharing method, device, equipment and medium
CN115529242B (en) * | 2022-09-23 | 2023-07-18 | Zhejiang University | Method for realizing cloud network resource allocation at an optimal water level
CN115562877B (en) * | 2022-11-15 | 2023-03-24 | Beijing Aqrose Technology Co., Ltd. | Orchestration method, device and equipment for distributed computing power resources, and storage medium
CN116431335B (en) * | 2023-03-21 | 2024-06-07 | Harbin Institute of Technology | Control-group-based container message queue resource quota control method
CN117611425B (en) * | 2024-01-17 | 2024-06-11 | Zhejiang Lab | Graphics processor computing power configuration method, device, computer equipment and storage medium
CN118331751B (en) * | 2024-06-14 | 2024-08-20 | Jinan Inspur Data Technology Co., Ltd. | Computing resource allocation method, computer program, device and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN109558446A (en) * | 2018-12-13 | 2019-04-02 | Hangzhou DtDream Technology Co., Ltd. | Job request method, apparatus, electronic equipment and storage medium
CN109962940A (en) * | 2017-12-14 | 2019-07-02 | Beijing Yunjishu Technology Co., Ltd. | Cloud-platform-based virtualization instance scheduling system and scheduling method
CN110837410A (en) * | 2019-10-30 | 2020-02-25 | Beijing QIYI Century Science and Technology Co., Ltd. | Task scheduling method and device, electronic equipment and computer-readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
US8990807B2 (en) * | 2012-08-09 | 2015-03-24 | VCE Company LLC | Virtual instance reconfiguration
US10362099B2 (en) * | 2015-06-23 | 2019-07-23 | Amazon Technologies, Inc. | Multiple instance types serving a single workload or application
US10719369B1 (en) * | 2017-06-01 | 2020-07-21 | Amazon Technologies, Inc. | Network interfaces for containers running on a virtual machine instance in a distributed computing environment
CN108769254B (en) * | 2018-06-25 | 2019-09-20 | Transwarp Information Technology (Shanghai) Co., Ltd. | Preemption-scheduling-based resource sharing application method, system and equipment
CN110457135A (en) * | 2019-08-09 | 2019-11-15 | Chongqing Unisinsight Technology Co., Ltd. | Resource scheduling method and device, and method for sharing GPU video memory


Also Published As

Publication Number | Publication Date
CN112380020A (en) | 2021-02-19

Similar Documents

Publication | Title
CN112380020B (en) | A method, device, equipment and storage medium for allocating computing power resources
US10003500B2 (en) | Systems and methods for resource sharing between two resource allocation systems
CN103067468B (en) | Cloud dispatching method and system thereof
CN104243405B (en) | Request processing method, apparatus and system
CN108829469B (en) | Application page display method and device
CN113419846B (en) | Resource allocation method and device, electronic equipment and computer-readable storage medium
CN107832143B (en) | Method and device for processing physical machine resources
CN112463535A (en) | Multi-cluster exception handling method and device
CN110162393B (en) | Task scheduling method, device and storage medium
CN105808341A (en) | Method, apparatus and system for scheduling resources
CN109729113B (en) | Method, server system and computer program product for managing dedicated processing resources
CN114327846A (en) | Cluster expansion method, device, electronic device, and computer-readable storage medium
CN111930516A (en) | Load balancing method and related device
CN109102200B (en) | Timed task processing method and device
CN114265648B (en) | Code scheduling method, server, client and system for acquiring a remote desktop
CN114968500A (en) | Task scheduling method, device, equipment and storage medium
CN111813541B (en) | Task scheduling method, device, medium and equipment
CN115344350A (en) | Node equipment of a cloud service system and resource processing method
WO2025103006A1 (en) | Serverless-computing-based data processing methods and electronic device
CN112156453B (en) | Instance adaptive adjustment method, apparatus, computer-readable storage medium and device
CN116069493A (en) | Data processing method, device, equipment and readable storage medium
CN118331715A (en) | Task scheduling method, device, equipment and storage medium
CN109426561A (en) | Task processing method, device and equipment
CN117651044A (en) | Edge computing task scheduling method and device
CN118069302A (en) | Data processing method and device, electronic equipment and storage medium

Legal Events

- PB01: Publication
- REG: Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40038376; Country of ref document: HK)
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant
