Summary of the invention
In order to overcome the deficiencies of the above existing technologies, purpose of the present invention is to provide a kind of distribution of concurrent computation resourceMethod and device, to improve the utilization rate of cluster resource.
In order to achieve the above object, the present invention proposes a kind of concurrent computation resource distribution method, include the following steps:
Step 1 receives Current resource distribution request, obtains the total number resource of the current resource allocation request;
Step 2 obtains the resource related information that each clustered node includes in parallel computing trunking;
Step 3 calculates the resource allocation cost of each clustered node according to the resource related information of each clustered node;
Step 4, according to the resource allocation cost of each clustered node and the total number resource to the current resource allocation request pairThe request task answered carries out resource allocation.
Further, step 3 further comprises:
The transition cost generated when calculating each clustered node resource allocation according to the resource related information of each clustered node,The transition cost is the cost that energy level transition occurs when splitting resource group in clustered node every time and generates;
Each cluster after having distributed resource release in each clustered node is calculated according to the resource related information of each clustered nodeThe remaining cost of node;
Resource allocation when each clustered node distribution resource is calculated according to the transition cost of each clustered node and remaining costCost.
Further, the transition cost obtains as follows:
The fractionation cost that resource group generates is split when calculating each clustered node resource allocation;
Transition cost when each clustered node resource allocation is calculated according to the fractionation cost for splitting resource group.
Further, the number of resources for splitting cost and including by surplus resources number in current cluster node and resource groupIt calculates and obtains, or directly setting obtains.
Further, the remaining cost is obtained by surplus resources number in clustered node.
Further, the resource related information include each resource group includes in current cluster node number of resources, whenAll total number resources in preceding clustered node use total number resource and surplus resources sum.
Further, the resource related information is obtained by the heartbeat signal that each clustered node is sent.
Further, in step 4, select the smallest clustered node distribution of resource allocation cost mutually should total number resourceResource give the request task.
In order to achieve the above objectives, the present invention also provides a kind of concurrent computation resource distributors, comprising:
Resource allocation request receiving module obtains Current resource distribution request for receiving Current resource distribution requestTotal number resource;
Resource information obtains module, the resource correlation letter for including for obtaining each clustered node in parallel computing trunkingBreath;
Resource allocation cost computing module calculates the money of each clustered node according to the resource related information of each clustered nodeCost is distributed in source;
Resource distribution module, according to the resource allocation cost of each clustered node and the resource of the current resource allocation requestSum carries out resource allocation to the corresponding request task of the current resource allocation request.
Further, the resource allocation cost computing module includes:
Transition cost calculating unit, for calculating each clustered node resource point according to the resource related information of each clustered nodeThe transition cost that timing generates, the transition cost are that energy level transition occurs and generates when splitting resource group in clustered node every timeCost;
Remaining cost calculating unit has divided for being calculated in each clustered node according to the resource related information of each clustered nodeRemaining cost with each clustered node after resource release;
Cost calculating unit is distributed, each clustered node is calculated according to the transition cost of each clustered node and remaining cost and is distributedResource allocation cost when resource.
Further, resource group is split when the transition cost calculating unit calculates each clustered node resource allocation firstThe fractionation cost of generation calculates transition cost when each clustered node resource allocation further according to the fractionation cost for splitting resource group.
Further, the number of resources for splitting cost and including by surplus resources number in current cluster node and resource groupIt calculates and obtains, or directly setting obtains.
Further, the remaining cost is obtained by surplus resources number in clustered node.
Compared with prior art, a kind of concurrent computation resource distribution method of the present invention and device are by carrying out resource allocationBefore according to the resource related information for including in each clustered node, when calculating each clustered node distribution respective numbers resourceCost is distributed, so that selection generates the less clustered node of fragment and carries out resource allocation, effectively when ensure that request resource every timeGround improves the utilization rate of entire cluster resource.
Specific embodiment
Below by way of specific specific example and embodiments of the present invention are described with reference to the drawings, those skilled in the art canUnderstand further advantage and effect of the invention easily by content disclosed in the present specification.The present invention can also pass through other differencesSpecific example implemented or applied, details in this specification can also be based on different perspectives and applications, without departing substantially fromVarious modifications and change are carried out under spirit of the invention.
Fig. 1 is a kind of step flow chart of concurrent computation resource distribution method of first embodiment of the invention.As shown in Figure 1,A kind of concurrent computation resource distribution method of the present invention, includes the following steps:
Step 101, Current resource distribution request is received, the total number resource of Current resource distribution request is obtained.
The total number resource of the Current resource distribution request is generally the number of resources needed when executing new task;The resourceRefer to that each clustered node distributes least resource unit when resource in the task of execution, such as in parallel computing trunking nodeCarry out parallel computation using multiple GPU, then when single GPU is that clustered node executes task, the least resource unit of distribution.ThisIt include multiple clustered nodes in cluster when parallel computation, each clustered node includes one or more resource groups, each in inventionIt include the resource of fixed quantity in resource group.
Step 102, the resource related information that each clustered node includes in parallel computing trunking is obtained.The resource is relatedInformation includes the number of resources that each resource group includes in current cluster node, all total number resources, use in current cluster nodeTotal number resource and surplus resources sum in embodiments of the present invention, the heartbeat that resource related information can be sent by each clustered nodeSignal obtains, that is to say, that each clustered node periodically can send heartbeat signal to distributor, to prove oneself to be still active in collectionGroup in, the heartbeat signal include each clustered node itself ID number and corresponding resource related information, but not limited to this.
Step 103, according to the resource related information of each clustered node, the resource allocation cost of each clustered node is calculated.
In the present invention, resource allocation cost consists of two parts: when each clustered node resource allocation, splitting resource groupThe remaining cost after resource is released has been distributed in the energy level transition cost of generation and each clustered node.Fig. 2 is the present invention theThe detailed flowchart of step 103 in one embodiment.As shown in Fig. 2, step 203 further comprises following steps:
Step S1, the jump generated when calculating each clustered node resource allocation according to the resource related information of each clustered nodeMove cost.The transition cost refers to the energy level transition that occurs when splitting resource group in clustered node every time and the cost that generates.ByIt is limited to hardware, when distributing resource in clustered node, if necessary to split resource group, the cost of cost is often higher, i.e. fractionation generationValence is higher;If you do not need to splitting set of resources matches resource, the cost of cost is often lower, and well below fractionation resource groupWhen the fractionation cost that spends, can be ignored, therefore, the calculating process of transition cost is as follows:
1) the fractionation cost that resource group generates is split first in computing cluster node when resource allocation
Splitting cost only can just generate when resource group in splitting clustered node, split in cost and current cluster nodeThe number of resources that surplus resources number and resource group include is related, can pass through surplus resources number in current cluster node and resource group packetThe number of resources contained, which calculates, to be obtained, in the specific embodiment of the invention, specific formula for calculation such as following formula (1):
Wherein, k indicates the fractionation cost generated when splitting resource group in clustered node;R indicates to remain in current cluster nodeRemaining number of resources, g are the number of resources that resource group includes in clustered node, and j is to split cost adjustment parameter, and value is more than or equal to 1Integer, specific value can be determined according to application demand or experimental result, it acts as guaranteeing that the value of k is 0 or 1, i.e., ought be torn openWhen dividing resource group, value 1, when not splitting resource group, value 0;
It should be noted that above-mentioned formula (1) is merely illustrative calculation method, it is of course also possible to not use what above formula provided to tear openDivide cost calculation method, directly setting clustered node splits the fractionation cost of resource group every time, such as directly set when distributing resourceFractionation cost when the fixed resource group of fractionation every time is 1, when not splitting resource group, and splitting cost is 0, and invention is not limited thereto.
2) transition cost when each clustered node resource allocation is calculated according to the fractionation cost for splitting resource group
When there are multiple resource groups in clustered node, cluster is caused splitting resource group progress resource allocation Shi Douhui every timeThe transition of node energy level generates transition cost, and the energy level transition refers to when not splitting resource group in cluster when resource allocation, energy levelIt does not change, after splitting resource group, energy level is reduced, and transition occurs for energy level, and then generates transition cost.
It is generated when all fractionation resource groups in each clustered node of transition cost when each clustered node resource allocationThe sum of cost is split, if Fig. 3 is that the energy level transition generated when splitting resource group in clustered node in the specific embodiment of the invention is shownIt is intended to, wherein the clustered node includes 8 resources altogether, and 2 resource groups, fixed each resource group includes 4 resources, works as clusterWhen middle resource drops to 4 by 8 or drops to 0 by 4, transition occurs for energy level, generates transition cost, and the transition cost passes through such as following formula(2) it calculates and obtains:
Wherein, TR-aFrom the transition cost after a resource of R resource allocation after expression clustered node distribution resource, a is indicatedRequest number of resources, kiIndicate that fractionation cost when clustered node distribution resource has split money only when distributing i-th of resourceSource group, kiValue is 1, otherwise, value 0.
Illustrate the calculating process of each clustered node transition cost below by way of a specific example.It is assumed that request number of resourcesIt is 2, and includes two clustered nodes node1 and node2, the relevant information of each clustered node such as following table in parallel computing trunkingShown in 1:
Table 1
Wherein, g indicates the number of resources that resource group includes in each clustered node, and a indicates request number of resources, and Total is indicatedThe total number resource that each clustered node includes, Used indicate the assigned number of resources of each clustered node, R1、R2Table respectivelyShow clustered node surplus resources number after distributing resource;The calculating of the transition cost of each clustered node in the specific embodiment of the inventionProcess is as follows:
1) the transition cost of node1 node is calculated
It first calculates node1 node and distributes the fractionation cost after two resources, since remaining 5 resources of node1 node are available,Each resource group includes 4 resources, when node distributes the 1st resource, resource group is not split, when node distributes the 2nd moneyWhen source, need to split a resource group, therefore, can calculate according to formula (1) and split cost is 1;Due to node1 node distribution 2A resource group is split when a resource altogether;Therefore, transition cost when node1 node is reduced to 3 resources by 5 resources is instituteHave and splits the sum of cost, i.e., 1;
2) the transition cost of node2 node is calculated
It equally first calculates node2 node and distributes the fractionation cost after two resources, due to remaining 3 resources of node2 nodeIt can use, each resource group includes 4 resources, after node distributes two resources, does not split resource group, and splitting cost is 0;It therefore is 0 by the transition cost that 3 resources are reduced to 1 resource after node2 node distributes 2 resources.
Step S2 is calculated according to the resource related information of each clustered node after having distributed resource release in each clustered nodeThe remaining cost of each clustered node.
It, can be whole by resource in clustered node due in actual use, guaranteeing the integrality of resource in each clustered nodeBody distributes to the biggish prior task of resource consumption, therefore value with higher only lacks a money in clustered nodeWhen source, the potential value having is higher, because only needing to discharge the integrality that a resource ensures that the clustered node,When carrying out resource allocation, task directly that resource allocation in entire clustered node is bigger to resource consumption.For this purpose, the present invention existsWhen calculating the resource allocation cost of each clustered node, it is also contemplated that allocated resource collects upon discharge in each clustered nodeThe remaining cost of group node, the residue cost refers to since clustered node is when distributing resource, does not consider to have divided in clustered nodeLead to the impaired generated cost of potential value with the integrality of clustered node can be kept after node release, specific residue generationThe calculation method of valence is as follows:
The remaining cost of the clustered node is related to surplus resources number in clustered node, can be by remaining in clustered nodeNumber of resources obtains, and when surplus resources are more in clustered node, and surplus resources are distributed to the task of request resource, loss is divedIt is bigger being worth, to keep the remaining cost of clustered node bigger;Therefore, the remaining cost and clustered node of the clustered nodeIn remaining number of resources be inversely proportional, remaining cost can be used such as following formula (3) calculate obtain:
Wherein, LRRemaining cost when for clustered node R resource of residue, R are that clustered node distributes the residue before resourceNumber of resources;
Step S3, when calculating each clustered node distribution resource according to the transition cost of each clustered node and remaining costDistribute cost.
In the specific embodiment of the invention, it is described distribution cost be each clustered node distribute resource after transition cost withThe sum of remaining cost is obtained using such as following formula (4):
CR-a=TR-a+LR (4)
Wherein, CR-aFor the distribution cost in clustered node when remaining R resource after a resource of distribution.It should be noted hereinIt is that formula (4) is merely illustrative, transition cost and remaining cost can also be weighted after processing and calculates distribution generationValence, invention is not limited thereto.
Illustrate the calculating process for distributing cost below by way of a specific example: such as current request number of resources is 2The resource related information of clustered node node1 and node2 are as shown in table 2 below, and design parameter meaning is identical as table 1 in table, wherein CDistribution cost when resource is distributed for each clustered node, other parameters meaning is identical as table 1, clustered node node1 and node2Distribution cost C calculating process it is as described below:
Table 2
Transition cost after first calculating each clustered node distribution resource, since node1 and node2 node distributes 2 moneysResource group is not split behind source, therefore fractionation cost and transition cost are all 0;
Calculate the remaining cost of node1 node: 1-1/7=6/7;
Calculate the remaining cost of node2 node: 1-1/2=1/2;
Therefore, the distribution cost of node1 node distribution resource is 0+6/7=6/7, and node2 node distributes the distribution of resourceCost is 0+1/2=1/2.
Step 104, according to the resource allocation cost of each clustered node and resource request sum to the current resource allocationCorresponding request task is requested to carry out resource allocation.In the embodiment of the present invention, in specific distribution, then select distribution cost minimumClustered node distribute the resource of corresponding requested resource sum to the request task.
Since clustered node each in parallel computing trunking can include one or more resource groups, each resource group includes to fixThe resource allocation methods of the resource of quantity, the prior art are allocated resource in cluster only with mutual exclusion, exclusive mode, notConsider the distribution cost generated when resource allocation, but directly give the resource allocation for meeting demand to demand task, often tears openThe set of resources divided in each clustered node matches resource.In this case, if surplus resources are less in a resource group, next demandWhen task requests resource, it is often unable to satisfy requirement, thus the fragment for generating resource fragmentation, and generating over timeCan be more and more, when there is larger task to need more resource, resource fragmentation is unable to satisfy demand, larger task in clustered nodeAfter needing to wait and having distributed resource release, the resource of quantity required can be just obtained, thus it is longer to will cause task waiting time, collectionThe problems such as group's parallel processing capability decline, resource utilization is lower, the present invention not only considers each clustered node in resource allocationTransition cost after distribution request number of resources resource, while considering the clustered node that each clustered node has distributed after resource releaseRemaining cost, resource point is carried out to request task according to the resource allocation cost of each clustered node and resource request sumMatch, problem of the existing technology can be well solved.
Fig. 4 is a kind of structural schematic diagram of concurrent computation resource distributor of second embodiment of the invention.As shown in figure 4,A kind of concurrent computation resource distributor of the present invention, comprising: resource allocation request receiving module 10, resource information obtain module11, resource allocation cost computing module 12 and resource distribution module 13.
Wherein, resource allocation request receiving module 10 obtains Current resource distribution for receiving Current resource distribution requestThe total number resource of request.The total number resource of the Current resource distribution request is generally the number of resources needed when executing new task;The resource refers to that each clustered node in the task of execution, distributes least resource unit when resource, such as parallel computing trunking sectionCarry out parallel computation using multiple GPU in point, then when single GPU is that clustered node executes task, the least resource list of distributionPosition.It include multiple clustered nodes in cluster when parallel computation in the present invention, each clustered node includes one or more resource groups,It include the resource of fixed quantity in each resource group.
Resource information obtains module 11, the resource correlation letter for including for obtaining each clustered node in parallel computing trunkingBreath.The resource related information includes the number of resources that each resource group includes in current cluster node, institute in current cluster nodeThere is total number resource, using total number resource and surplus resources sum.In the specific embodiment of the invention, resource related information can pass throughThe heartbeat signal that each clustered node is sent obtains, that is to say, that and each clustered node periodically can send heartbeat signal to distributor,To prove oneself to be still active in cluster, the heartbeat signal includes the ID number and corresponding resource phase of each clustered node itselfClose information.
Resource allocation cost computing module 12 calculates each clustered node according to the resource related information of each clustered nodeResource allocation cost.
In the present invention, resource allocation cost consists of two parts: when each clustered node resource allocation, splitting resource groupThe remaining cost after resource is released has been distributed in the energy level transition cost of generation and each clustered node.Fig. 5 is the present invention theThe detail structure chart of resource allocation cost computing module in two embodiments.As shown in figure 5, resource allocation cost computing module 12 intoOne step includes: transition cost calculating unit 120, remaining cost calculating unit 121, distribution cost calculating unit 122.
Transition cost calculating unit 120, for calculating each clustered node according to the resource related information of each clustered nodeThe transition cost generated when resource allocation.The transition cost refers to the energy level jump occurred when splitting resource group in clustered node every timeThe cost moved and generated.It is limited to hardware, when distributing resource in clustered node, if necessary to split resource group, the cost of costOften higher, i.e. fractionation cost is higher;If you do not need to splitting set of resources matches resource, the cost of cost is often lower, andWell below the fractionation cost spent when splitting resource group, can be ignored, therefore, the meter of transition cost calculating unit 120Calculation process is as follows:
1) the fractionation cost that resource group generates is split when first calculating resource allocation in each clustered node
The fractionation cost only can just generate when resource group in splitting clustered node, the fractionation cost and current collectionSurplus resources number and resource group include that number of resources is related in group node, can pass through surplus resources number and money in current cluster nodeThe number of resources that source group includes, which calculates, to be obtained, and in the specific embodiment of the invention, circular is obtained using following formula:
Wherein, k indicates the fractionation cost generated when splitting resource group in clustered node;R indicates to remain in current cluster nodeRemaining number of resources, g are that resource group includes number of resources in clustered node, and j is to split cost adjustment parameter, for the integer more than or equal to 1,Specific value can be determined according to application demand or experimental result, it acts as guaranteeing that the value of k is 0 or 1, i.e., when fractionation resource groupWhen, value 1, when not splitting resource group, value 0;
It should be noted that above formula is merely illustrative calculation method, it is of course also possible to which not using above formula to provide splits cost meterCalculation method, directly setting clustered node split the fractionation cost of resource group, such as directly setting is torn open every time when distributing resource every timeDividing fractionation cost when resource group is 1, when not splitting resource group, and splitting cost is 0;Invention is not limited thereto.
2) transition cost when being distributed according to the fractionation cost computing cluster node resource for splitting resource group
When there are multiple resource groups in clustered node, cluster is caused splitting resource group progress resource allocation Shi Douhui every timeThe transition of node energy level generates transition cost, and the energy level transition refers to when not splitting resource group in cluster when resource allocation, energy levelIt does not change, after splitting resource group, energy level is reduced, and transition occurs for energy level, and then generates transition cost.
What transition cost when clustered node resource allocation generated when being all fractionation resource groups in each clustered node tears openDivide the sum of cost.Following formula acquisition can be used in the transition cost:
Wherein, TR-aFrom the transition cost after a resource of R resource allocation after expression clustered node distribution resource, a is indicatedRequest number of resources, kiIndicate that fractionation cost when clustered node distribution resource has split money only when distributing i-th of resourceSource group, kiValue is 1, otherwise, value 0.
Remaining cost calculating unit 121, for calculating each clustered node according to the resource related information of each clustered nodeIn distributed resource release after each clustered node remaining cost.
It, can be whole by resource in clustered node due in actual use, guaranteeing the integrality of resource in each clustered nodeBody distributes to the biggish prior task of resource consumption, therefore value with higher only lacks a money in clustered nodeWhen source, the potential value having is higher, because only needing to discharge the integrality that a resource ensures that the clustered node,When carrying out resource allocation, task directly that resource allocation in entire clustered node is bigger to resource consumption.For this purpose, the present invention existsWhen calculating the resource allocation cost of each clustered node, it is also contemplated that allocated resource collects upon discharge in each clustered nodeThe remaining cost of group node, the residue cost refers to since clustered node is when distributing resource, does not consider to have divided in clustered nodeLead to the impaired generated cost of potential value with the integrality of clustered node can be kept after node release, specific residue generationThe calculating process of valence is as follows:
The remaining cost of the clustered node is related to surplus resources number in clustered node, can be by remaining in clustered nodeNumber of resources obtains, and when surplus resources are more in clustered node, and surplus resources are distributed to the task of request resource, loss is divedIt is bigger being worth, to keep the remaining cost of clustered node bigger;Therefore, the remaining cost and clustered node of the clustered nodeIn remaining number of resources be inversely proportional, it is described residue cost calculation method be shown below:
Wherein, LRRemaining cost when for clustered node R resource of residue, R are that clustered node distributes the residue before resourceNumber of resources;
Cost calculating unit 122 is distributed, for calculating each collection according to the transition cost and remaining cost of each clustered nodeGroup node distributes distribution cost when resource.
In embodiments of the present invention, the distribution cost is that each clustered node distributes transition cost and residue after resourceThe sum of cost is obtained using following formula:
CR-a=TR-a+LR
Wherein, CR-aFor the distribution cost in clustered node when remaining R resource after a resource of distribution.
Resource distribution module 13, according to the resource allocation cost of each clustered node and resource request sum to request taskCarry out resource allocation.In the embodiment of the present invention, in specific distribution, then select the smallest clustered node distribution of distribution cost correspondingThe resource of resource request sum is to request task.
In conclusion a kind of concurrent computation resource distribution method of the present invention and device pass through the root before carrying out resource allocationAccording to the resource related information for including in each clustered node, distribution generation when each clustered node distribution respective numbers resource is calculatedValence effectively improves so that selection generates the less clustered node of fragment and carries out resource allocation when ensure that request resource every timeThe utilization rate of entire cluster resource.
Anyone skilled in the art without departing from the spirit and scope of the present invention, repair above-described embodimentDecorations and change.Therefore, the scope of the present invention, should be as listed in the claims.