Disclosure of Invention
The present invention provides a dynamic task oriented multi-robot distributed task allocation formation method that overcomes or at least partially solves the above-mentioned problems by solving the problems of multiple dynamic tasks in an environment through cooperative cooperation between robots based on an improved auction algorithm and in conjunction with a-star algorithm path optimization.
In order to achieve the above object, the present invention provides a dynamic task-oriented multi-robot distributed task allocation forming method, which comprises:
s1: according to the information in the environment map, the factors and difficulties which need to be considered in the task allocation problem are cleared;
s2: when a task occurs, a task allocation scheme is generated based on a multi-stage auction algorithm, and the robot executes the task according to the task allocation scheme.
Further, the step S1 includes:
s11: a target environment map is given, and a plurality of robots with task execution capacity, task points with attributes changing along with time and a plurality of obstacles are distributed on the map;
s12: according to the state change of the task points, the factors and difficulties needing to be considered for cleaning task distribution
Further, in step S11, the environment is a robot work environment, a rectangular coordinate system is established for a given environment map, the coordinate system uses the x axis to the right as a positive direction and the y axis to the positive direction, the work environment is divided into a plurality of grids, and N (N belongs to Z) is spatially distributed on the environment map+) Task point, M (M belongs to Z)+) Individual robot and B (B ∈ Z)+) And determining coordinates of the robot, the task and the obstacle, respectively.
Where the task is denoted by the letter j, where j equals 1,2, …, N, the task approximates a point on the map. The robot is denoted by the letter i, where i ═ 1,2, …, M. The obstacle is denoted by the letter B, where B is 1,2, …, B.
The coordinates of the task point on the map are
Wherein j is 1,2, …, N; the coordinates of the robot on the map are
Wherein i is 1,2 … M; the coordinates of the obstacle on the map are
Where k is 1,2, …, B.
The distance formula between the task point and the robot is as follows:
suppose that the j coordinate of the task point and the j' coordinate of the task point are respectively
The distance between the two is expressed as:
further, in step S12, the state quantity of the task point j (j ═ 1,2, …, N) at time (t +1) changes exponentially with time, and the state quantity of the task point j (j ═ 1,2, …, N) is expressed as:
in the formula:
Sj(t+1),Sj(t) task point state quantities of task j (j ═ 1,2, …, N) at stages (t +1) and t, respectively;
αjthe state change rate of the jth task point;
βiexecution capabilities of robot i and βi>0;
The time difference between t and (t + 1);
the task quantity threshold value is set, whether the task is finished or not is judged, and when s is reached, the task quantity threshold value is setj(t) < ε, this indicates that the task is complete.
This is a task state distributed dynamic task allocation problem. The state change of the task point is in a continuous exponential function form (equation three). The task volume is time-distributed and is affected by how many robots are performing the task (the sum of the performance capabilities of the agent).
The completion time of task point j (j ═ 1,2, …, N) is inversely proportional to the sum of the performance capabilities acting on task point j. The execution capacity of m robots for executing the target task j at the stage t and
then, it indicates that the m robots cannot complete the task point, s
j(t) exhibits an ascending trend; when in use
Then m robots are shown to be able to complete the task point, s
j(t) shows a tendency to decline.
Further, in step S12, the system task assignment needs to consider factors including task point attribute parameters and robot attribute parameters.
Attribute parameter of task point j (j is 1,2, … N)The number of the components comprises: task quantity (i.e. state value) S of task point
j(t) indicating the task amount of the task point at time t, wherein j is 1,2, …, N, and when t is 0, indicating the initial amount S of the task point
j(0) Duty point change rate α
j(ii) a The task points j are distributed in space, and the positions of the task points j on the environment map
The attribute parameters of the robot i (i ═ 1,2, …, M) include: velocity v of movement
iThe length of the path taken by the robot i in unit time, and a robot capability value β
i(ii) a The position of the robot on the environment map
With the increase of the number of target task points and robots, the difficulty of task allocation problem increases rapidly, and the importance degree and the state change rate of each target task point, the cost and the execution capacity of each intelligent agent when executing tasks are all factors to be considered.
Assuming that the robot that is heading for the jth task point does not go to other task points until the next phase of solution generation, for a given M (M ∈ Z)
+) Personal robot, N (N is belonged to Z)
+) A task point, which is shared by multiple auctions
A distribution scheme having a complexity of
In step S2, the method further includes:
s21: when a task is detected, selecting a robot closest to a task point as a proxy robot; if the robot closest to the plurality of task points is the same robot i (i is 1,2, …, M), the system selects the robot i of the task point closest to the origin of the rectangular coordinate system from the plurality of task points as the proxy robot;
s22: the agent robot issues auction information to the auction robots within the communication radius to wait for feedback;
s23: the auction robot receiving the auction information selects the task point with the highest profit to bid according to the profit function of the auction robot;
s24: the agent robot collects the bidding information, selects the winning robot according to a preset rule and informs the winning robot to execute the task;
s25: and selecting the nearest task point for the auction robots not winning the bid and the robots in the communication blind areas.
Further, in step S2, when detecting that the task point and the idle robot without targets have been completed, the idle robot is re-auctioned, so that the idle robot obtains a new task point again, thereby obtaining a new allocation scheme, and the auction is stopped until all task points in the environment are executed.
The target-oriented task of the invention is dynamically changed along with time, and the task points are distributed on the completed time point, namely the task points are inconsistent on the executed time.
If a traditional auction algorithm flow is used, after an allocation scheme is generated, a dynamic target task is executed according to the scheme, and it is difficult to adapt to the dynamic change and real-time requirements of the target task.
The improved auction algorithm can track the state change of the task points in real time, once the task points are finished, the robot without targets can exist, and the algorithm can redistribute the task points, so that a new distribution scheme is obtained.
Further, in step S22, the agent robot is responsible for issuing auction information to the auction robots within the communication radius, and waits for feedback; the agent robot of each task point j (j is 1,2, …, N) grasps the relevant auction information of the task point j, and the message includes: the position of the task point j, the change rate of the task point j, the state quantity of the task point j and the number s of suitable robots needed by the task point j. The state quantities of the task points change exponentially with time.
Number of suitable robots s required for task point j
jIs defined as: suppose there are w robots such that
And is
The optimal number s of robots needed by the task point j takes the following values:
w≤sjw +2 ≤ (four type)
The robots i (i is 1,2, …, M) all have corresponding communication radiuses ri(the communication radius of the robot can be set to be different values), and the agent robot is responsible for issuing auction information to the auction robots within the communication radius.
Further, in step S23, the auction robot that has received z (0 ≦ z ≦ N) pieces of proxy robot auction information balances the information with the following balance criteria: the auction robot analyzes the z auction information issued by the agent robot, the traditional auction algorithm is improved, a dynamic economic benefit function is introduced, and the robot i calculates the income value E after executing a certain task point jij(EijReal number) and selects the most profitable task point to bid. The dynamic revenue function used is:
Eij=g(k1,Sj(t),αj,βi,n)-h(k2,dij) (formula five)
In the formula:
g, the robot i obtains benefits after executing the task point j and is an exponential function;
h is the cost consumed by the robot i to execute the task point j;
k1, k2 variable weight parameters, which can be set to their respective values under different circumstances;
dijthe path length between the robot i and the task point j is long;
in the above formula, k1 and S
jThe product of (t) is the scaling factor of an exponential function, α
j,β
iN is the relation between the three
Forming an index in the index function, n representing the number of robots that are executing the task point j (when t is 0, the task point is not executed by a robot, and n is 0);
h is dijFunction, given the parameter k2, the value of h is given by dijAnd (6) determining.
Each robot has a corresponding execution capacity value β for a given environment mapi(i ═ 1,2, …, M), the ability value of this robot i is a constant, and the execution ability values between robots may be set to be different.
Further, in step S24, after receiving the bidding information of the auction robots, the agent robot analyzes the information, selects the auction robot that wins the bid according to the target with the greatest overall benefit, and notifies all the bidding auction robot results.
Further, in step S25, the robots in the communication blind areas in the map (there may be robots that cannot receive the auction information issued by the agent robot due to the limitation of the communication radius in the environment map) and the auction robots that have not bid themselves to execute the task points closest to the robots.
Further, in step S24, the agent robot may select the auction robot that wins the bid according to a predetermined rule, where the specific content is:
proxy robot acceptance n for task point jjBid message of each auction robot if nj>sjThen sort by profit size, choose top ranked(s)j-1) one auction robot acts as a middling robot and informs all bidding auction robots if nj<sjThen n isjAll the auction robots are the middle-standard robots.
Further, in the step S24, in the process of the robot traveling, the path is optimized through the a-star algorithm, so as to reasonably avoid collision with other robots and obstacles, and search out the optimal collision-free path for the robot to reach the corresponding task point.
And generating a task allocation scheme, starting to execute the target task point by the robot according to the allocated scheme, displaying the position of the robot in real time in the forward target task process by the robot, and searching the optimal collision-free path through an A-star algorithm.
In the searching process of the A-star algorithm, a method for searching the extended space node by a cost function is adopted, and the general form of the cost function is as follows:
(i) g (i) + h (i) (formula six)
In the formula:
(i) a valuation function of node i from the initial node to the target node;
g (i) the actual cost from the initial node to node i in the state space;
h (i) estimated cost of best path from node i to the target node.
The optimal collision-free path is searched through an A-star algorithm, the robot can reasonably avoid obstacles and cannot meet the obstacles in the traveling process, and the A-star is an effective method for solving the shortest path, so that the real-time and environment adaptability efficiency of path planning is improved.
The robot follows the following rules throughout the execution:
a) the optimal number of robots required for task point j is sj;
b) The robot which is going to the jth task point does not go to other task points before the scheme of the next stage is generated;
c) the robot cannot go to the executed task point;
d) the agent robot selects the auction robot for winning the bid according to a rule established in advance;
e) the auction robot which does not bid or the communication blind area robot (the robot which can not receive the auction information sent by the agent robot in the communication radius) can execute the nearest task point;
f) the agent robot executes the task point of the self agent.
Through the technical scheme, a set of complete execution scheme is finally generated, and the time required for completing all task points is obtained. The time required to complete all task points is the time at which the last task point was executed.
Based on the technical scheme, the multi-stage auction algorithm facing the dynamic tasks is provided, the problem of dynamic task allocation in the environment is solved, the given task is solved by one-time allocation adopted by the traditional auction algorithm, and the expected effect cannot be achieved due to the fact that the dynamic tasks are limited greatly. Through the realization of multiple auctions, the aim of time optimization is fulfilled, the resources of the robot are utilized to the greatest extent, the method is simulated on VC + + and Csharp platforms, and a large number of experimental simulation test results show that the improved auction algorithm can better solve dynamic tasks in the environment compared with the traditional auction algorithm, meets the real-time requirement through multiple distribution, and can provide an approximate optimal solution.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
The invention provides an improved auction algorithm and a method for solving the problem of a plurality of dynamic tasks in the environment through cooperative cooperation between robots in combination with A-star algorithm path optimization.
The multi-stage auction algorithm is improved on the basis of the traditional auction algorithm, and corresponding rules and dynamic profit functions are added. The traditional auction algorithm is suitable for solving the distribution problem of static target tasks, the value of the target tasks and the execution cost of the robots are respectively measured through benefit and cost parameters, the auction algorithm of the target tasks to the robots is designed, a distribution scheme is generated, and the whole system only carries out auction once.
The robot collaborative cooperation is embodied as follows: when the robot detects that a task point occurs, the agent robot issues auction information to the auction robot, and the auction robot balances and bids according to the received information; the agent robot selects an allocation scheme according to the feedback information and a preset rule, the robot sends the known information to the robot in the communication radius of the robot in the process of executing the task, resource sharing is achieved, once the task points are completed, the idle robot obtains target task points through re-auction to assist other robots to execute the task points until all the task points are completed.
According to an embodiment of the present application, as shown in fig. 1, the environment model diagram of the present invention includes the following steps:
firstly, a rectangular coordinate system is established for a target map, the target map is divided into a plurality of map blocks with the same size, and the length and the width of each map block are equal.
Secondly, in the embodiment of the present invention, given that the length and width of the target map are all 300 units, the map contains 89401 map blocks.
Given an environment map, M (M E Z) can be set specifically+) Personal robot, N (N is belonged to Z)+) A task point, and B (B ∈ Z)+) An obstacle.
And calculating and storing the distance between the task points (formula two) and the distance between the task points and the robot (formula one).
Finally, the study objective of the invention was analyzed: aiming at a plurality of robots with certain task execution capacity, a plurality of tasks with the attributes changing along with time are given, a multi-robot dynamic task allocation strategy is established, the completion time of all tasks is optimized, and the completion time of the last task can be used for measurement.
When M (0 ≦ M) robots act on task point j (j ≦ 1,2, …, N), the task state quantity formula for task point j at (t +1) (t ≧ 0) is:
as shown in FIG. 2, the multi-stage auction algorithm employed in the present invention is a flow chart, detailed as follows:
in S201, when a task occurs, a proxy robot for each task point is determined, and a robot closest to the task point j is selected as a proxy robot for the task point j.
The workflow of the agent robot is shown with reference to fig. 3, and is detailed as follows:
after receiving the message of the monitoring system, the robot i (i is 1,2, …, M) closest to the task point j (j is 1,2, …, N) is the proxy robot of the task point, the robot i first knows the task specific information of the task point j and sorts the task specific information, mainly calculates the optimal number s of robots required for completing the task j and obtains the change rate α of the task pointjThe task state quantity of the task point, the geographic position of the task point and the like. And then sending auction information to the auction robots within the communication radius to wait for feedback. And the auction robots receiving the auction information select the task points with the highest profit to bid according to the profit functions of the auction robots, the agent robot selects the auction robots winning the bid with the highest overall benefit as the target, and notifies all the bidding auction robots. And if the number n of the landmark robots reaches the condition that n is larger than s, selecting s robots with top profit ranks, and reselecting the task points closest to the landmark robots.
In S203, the auction robot workflow refers to fig. 4, which is detailed as follows:
if the auction robot is not selected as the agent robot, the auction robot is automatically converted into the auction robot, and the agent robot firstly issues auction information to the auction robots within the communication radius to wait for feedback. The auction robot receiving the auction information selects the task point with the highest profit to bid according to the profit function of the auction robot, and the agent robot collects the bid information and selects the auction robot winning the bid with the highest overall benefit of the system as the target. And if the auction robot successfully wins the bid, starting to execute the task, and if the auction robot fails to win the bid, selecting to execute the nearest task point. And if the task point is completed, the idle robots are redistributed according to the current situation until the task is completed.
And generating a distribution scheme, starting to execute a task, optimizing the path by the robot through an A-Star algorithm in the process of traveling, reasonably avoiding collision with other robots and obstacles, and searching out the optimal collision-free path for the robot to reach a corresponding task point.
Once there are completed task points, there will be untargeted idle robots, and the system will re-auction them, resulting in a new allocation scheme until all task points in the environment are completed.
The robots cooperatively perform the task test.
The above-mentioned method of the improved auction algorithm (multi-stage auction algorithm) has M robots, N task points, and B obstacles. In that
(m is the number of robots performing task point j). In order to test the improvement of the auction algorithm in performing the dynamic task, the test was performed using a map having a length and a width of 300 grids, comparing the effects of the single-stage auction algorithm and the multi-stage auction algorithm.
Fig. 5 and fig. 6 are both test examples, in which the thick black small circle is a proxy robot, and therobot 3 is a communication blind area robot, specifically as follows:
fig. 5 is a simulation diagram of a test single-stage auction algorithm, whererobots 2,4, and 5 are proxy robots for task points 2,3, and 1, respectively, and the proxy robots distribute auction information corresponding to the task points to auction robots within a communication radius; after the first round of auction, theauction robots 1,6 and 7 successfully bid for winning the bid and respectively execute the task points 2,3 and 1. Therobot 3 selects thetask point 1 with the closest execution distance for the robot with the communication blind area (which is not in the communication radius range of the proxy robot and cannot accept auction information issued by the proxy robot), the robot starts to execute the target task point after the task allocation scheme is generated, the auction is not carried out in the execution process, and once the task point is executed, the idle robot does not assist other robots to execute the unfinished task point.
Fig. 6 is a simulation diagram of a certain testing multi-stage auction algorithm, in which the initial allocation method is the same as the single-stage auction algorithm allocation method, and the multi-stage auction algorithm is characterized in that once there is a completed task point, there are targets-free robots, and the algorithm reallocates them, in the diagram, thetask point 1 is executed first, and the algorithm reallocates therobots 3,5, and 7, so that the idle robots are not wasted, and the goal of designing the allocation scheme in stages is achieved by utilizing the resources of the robots to the maximum extent through multiple auctions.
In order to verify the efficiency of the improved auction algorithm, 8 sets of test data were selected, the communication radius was set to 200 units, k1 is 1, k2 is 0.0025; the statistical results are shown in table 1.
Table 1 is a statistical table of test times
The simulation result is drawn into a line graph, as shown in fig. 7, the comparison result is carried out, after corresponding rules and dynamic revenue functions are introduced for the dynamic tasks, the overall effect of the robot for executing the dynamic tasks is greatly improved, and the effectiveness of the robot for the dynamic tasks adopting the improved multi-stage auction algorithm is proved through simulation tests.
Finally, the method of the present application is only a preferred embodiment and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.