CN112948464B

Info

Publication number: CN112948464B
Application number: CN202110237543.5A
Authority: CN
Inventors: 张晓琴
Original assignee: Chongqing Industry Polytechnic College
Current assignee: Chongqing Industry Polytechnic College
Priority date: 2021-03-04
Filing date: 2021-03-04
Publication date: 2021-09-17
Anticipated expiration: 2041-03-04
Also published as: CN112948464A

Abstract

The invention discloses a collision avoidance intelligent robot based on reinforcement learning. The data acquisition module is used for collecting data information of the robot and surrounding environment information; the positioning module is used for obtaining the coordinates of the robot's movement and the coordinates of obstacles; the data processing module is used for receiving and processing the data information and the environment information and sending the results together to the data analysis module; the data analysis module is used for receiving the data sent by the data processing module and performing analysis and calculation to obtain a forward movement sorting set and a barrier shadow sorting set; the statistical early warning module is used for receiving the forward movement sorting set and the barrier shadow sorting set and performing statistics and early warning operations; and the regulation and control module is used for regulating and controlling the operation of the robot. The invention addresses the problem that existing schemes cannot comprehensively analyze the moving state of the robot and the state of obstacles so as to give early warning for the robot's operation and to learn and adjust in time.

Description

Collision avoidance intelligent robot based on reinforcement learning

Technical Field

The invention relates to the technical field of intelligent robots, in particular to an intelligent robot for collision avoidance based on reinforcement learning.

Background

An intelligent robot has at least three elements: first, a sensory element for recognizing the state of the surrounding environment; second, a movement element for acting in response to the outside world; third, a thinking element for deciding which action to take according to the information obtained by the sensory element. The sensory elements include non-contact sensors capable of sensing vision, proximity, distance and the like, and contact sensors capable of sensing force, pressure, touch and the like. These elements are substantially equivalent to human sense organs such as the eyes, nose and ears, and their functions can be realized by using electromechanical components such as a camera, an image sensor, an ultrasonic transducer, a laser, conductive rubber, a piezoelectric element, a pneumatic element, a travel switch and the like.

An advanced intelligent robot has the capabilities of sensing, identifying, reasoning and judging, and can automatically modify its programs within a certain range according to changes in external conditions. Moreover, the principle for modifying the program is not specified by a human; the robot itself obtains it by learning and summarizing experience.

The existing collision avoidance intelligent robot has the following defect: it cannot comprehensively analyze the moving state of the robot and the state of obstacles so as to give early warning for the robot's operation and to learn and adjust in time.

Disclosure of Invention

The invention aims to provide an intelligent collision avoidance robot based on reinforcement learning. The technical problem to be solved by the invention is as follows:

how to overcome the inability of existing schemes to comprehensively analyze the moving state of the robot and the state of obstacles so as to give early warning for the robot's operation and to learn and adjust in time.

The purpose of the invention can be realized by the following technical scheme: an intelligent robot for avoiding collision based on reinforcement learning comprises a data acquisition module, a positioning module, a data processing module, a data analysis module, a statistic and early warning module and a regulation and control module;

the data acquisition module is used for acquiring data information of the robot and surrounding environment information, wherein the data information comprises size data, movement data and electric quantity data of the robot; the environment information comprises type data of the obstacles and contact data between the obstacles, and the data information and the environment information are sent to the data processing module;

the positioning module is used for acquiring the moving coordinates of the robot to obtain a first coordinate set, acquiring the coordinates of the obstacle to obtain a second coordinate set, classifying and combining the first coordinate set and the second coordinate set to obtain a coordinate information set, and sending the coordinate information set to the data analysis module;

the data processing module is used for receiving the data information and the environment information for processing to obtain size processing data, movement processing data, electric quantity processing data, type processing data and contact processing data, and sending the size processing data, the movement processing data, the electric quantity processing data, the type processing data and the contact processing data to the data analysis module;

the data analysis module is used for receiving size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set, analyzing and calculating to obtain a forward movement sorting set and a barrier shadow sorting set;

the statistics early warning module is used for receiving the forward movement sorting set and the barrier shadow sorting set and carrying out statistics and early warning operation, and the specific steps comprise:

step one: receiving the forward movement sorting set and the barrier shadow sorting set, marking a preset standard forward shift threshold as P1 and a preset standard barrier shadow threshold as P2, and comparing P1 with the forward shift value Q_qy in the forward movement sorting set and P2 with the barrier shadow value Q_zy in the barrier shadow sorting set;

step two: if Q_qy ≥ P1 and Q_zy ≥ P2, judging that the robot can move efficiently and can normally avoid the obstacle, and generating a first early warning signal; if Q_qy < P1 and Q_zy ≥ P2, judging that the robot moves inefficiently but can normally avoid the obstacle, and generating a second early warning signal; if Q_qy ≥ P1 and Q_zy < P2, judging that the robot can move efficiently but cannot avoid the obstacle, generating a third early warning signal, and marking the forward shift value and the barrier shadow value corresponding to the third early warning signal as a first statistical forward shift value and a first statistical barrier shadow value respectively; if Q_qy < P1 and Q_zy < P2, judging that the robot moves inefficiently and cannot avoid the obstacle, generating a fourth early warning signal, and marking the forward shift value and the barrier shadow value corresponding to the fourth early warning signal as a second statistical forward shift value and a second statistical barrier shadow value respectively;

step three: sending the first statistical forward shift value, the first statistical barrier shadow value, the second statistical forward shift value and the second statistical barrier shadow value to a regulation module;

the regulation and control module is used for regulating and controlling the operation of the robot.
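
The four-way comparison performed by the statistical early warning module can be sketched in code. The following Python is a minimal illustration only, not the patented implementation: the function names `classify` and `collect_statistics` are hypothetical, and pairing each forward shift value with exactly one barrier shadow value is an assumption, since the text does not state how entries of the two sorting sets are matched.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (Q_qy, Q_zy); this pairing is an assumption


def classify(q_qy: float, q_zy: float, p1: float, p2: float) -> int:
    """Return 1..4 for the first to fourth early warning signal."""
    efficient = q_qy >= p1   # forward shift value reaches threshold P1
    avoids = q_zy >= p2      # barrier shadow value reaches threshold P2
    if efficient and avoids:
        return 1
    if not efficient and avoids:
        return 2
    if efficient and not avoids:
        return 3
    return 4


def collect_statistics(pairs: List[Pair], p1: float, p2: float) -> Tuple[List[Pair], List[Pair]]:
    """Gather the values forwarded to the regulation and control module.

    The first list holds (first statistical forward shift value,
    first statistical barrier shadow value) pairs from the third signal;
    the second list holds the corresponding pairs from the fourth signal.
    """
    first: List[Pair] = []
    second: List[Pair] = []
    for q_qy, q_zy in pairs:
        signal = classify(q_qy, q_zy, p1, p2)
        if signal == 3:
            first.append((q_qy, q_zy))
        elif signal == 4:
            second.append((q_qy, q_zy))
    return first, second
```

Only the third and fourth signals produce statistical values, which matches step three: cases in which the robot can already avoid the obstacle send nothing to the regulation and control module.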

Preferably, the specific steps of the data processing module for receiving and processing the data information and the environment information include:

s21: receiving the data information and the environment information, and acquiring the size data, movement data and electric quantity data of the robot from the data information;

s22: setting the maximum width in the size data as a first measured value and marking it as YCi, i = 1, 2, 3, ..., n; setting the maximum thickness in the size data as a second measured value and marking it as ECi, i = 1, 2, 3, ..., n; setting the height in the size data as a third measured value and marking it as SCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third measured values and combining the values to obtain size processing data;

s23: setting the maximum moving speed in the movement data as movement upper limit data and marking it as YSi, i = 1, 2, 3, ..., n; setting the maximum acceleration in the movement data as movement acceleration data and marking it as YJi, i = 1, 2, 3, ..., n; normalizing the marked movement upper limit data and movement acceleration data and combining the values to obtain movement processing data;

s24: marking the real-time electric quantity in the electric quantity data as first electrical data CDYi, i = 1, 2, 3, ..., n; marking the standby power consumption data in the electric quantity data as second electrical data CDEi, i = 1, 2, 3, ..., n; marking the moving power consumption data in the electric quantity data as third electrical data CDSi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third electrical data and combining the values to obtain electric quantity processing data;

s25: acquiring the type data of the obstacles and the contact data between the obstacles from the environment information;

s26: setting different obstacle types to correspond to different obstacle preset values, matching the obstacle types in the type data against all obstacle types to obtain the corresponding obstacle preset values, and marking them as ZLik, i = 1, 2, 3, ..., n; k = 1, 2; normalizing the obstacle preset values and combining the values to obtain type processing data; wherein ZLik covers the obstacle preset value for a movable obstacle and the obstacle preset value for a non-movable obstacle;

s27: setting the space height in the contact data between the obstacles as first obstacle measurement data and marking it as YZCi, i = 1, 2, 3, ..., n; setting the maximum width of the space in the contact data between the obstacles as second obstacle measurement data and marking it as EZCi, i = 1, 2, 3, ..., n; setting the minimum width of the space in the contact data between the obstacles as third obstacle measurement data and marking it as SZCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third obstacle measurement data and combining the values to obtain contact processing data.
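
The "normalize, then combine the values" pattern repeated in the steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the text does not say which normalization is used, so min-max scaling to [0, 1] is assumed here, and "value combination" is interpreted as grouping the normalized values of the same index i into one record. The helper names `min_max_normalize` and `combine` and the sample numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All samples identical: map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine(**series: Sequence[float]) -> List[Dict[str, float]]:
    """Normalize each marked series and group the values by index i."""
    lengths = {len(vals) for vals in series.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same number of samples")
    normalized = {name: min_max_normalize(vals) for name, vals in series.items()}
    n = lengths.pop()
    return [{name: vals[i] for name, vals in normalized.items()} for i in range(n)]


# Size processing data from the first, second and third measured values
# (sample widths, thicknesses and heights in centimeters, purely illustrative).
size_processing_data = combine(YC=[40, 60, 50], EC=[20, 20, 30], SC=[100, 120, 110])
```

The same `combine` call would apply to the movement data (YS, YJ), the electrical data (CDY, CDE, CDS) and the obstacle measurement data (YZC, EZC, SZC); only the series names change.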

Preferably, the specific steps of the data analysis module for performing the analysis operation include:

s31: acquiring size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set which are subjected to normalization processing;

s32: acquiring a forward shift value of the robot's movement by using a formula, wherein the formula is as follows:

(formula not reproduced in this text)

wherein Q_qy denotes the forward shift value, μ denotes a preset forward shift correction factor, a1 and a2 denote different proportionality coefficients, YSi denotes the movement upper limit data, CDYi denotes the first electrical data, CDEi denotes the second electrical data, CDSi denotes the third electrical data, t1 denotes the duration of standby power consumption of the robot, t2 denotes the duration of power consumption while the robot is moving, and t3 denotes the duration of acceleration while the robot is moving;

s33: arranging the forward shift values in descending order to obtain the forward movement sorting set;

s34: obtaining the distance value between the real-time coordinates of the robot's movement in the first coordinate set and the coordinates of the obstacle in the second coordinate set of the coordinate information set, and marking the distance value as D1;

s35: obtaining the barrier shadow value of the obstacle by using a formula, wherein the formula is as follows:

(formula not reproduced in this text)

wherein Q_zy denotes the barrier shadow value, α denotes a preset barrier shadow correction factor, b1, b2, b3 and b4 denote different proportionality coefficients, YCi denotes the first measured value, ECi denotes the second measured value, SCi denotes the third measured value, YZCi denotes the first obstacle measurement data, EZCi denotes the second obstacle measurement data, SZCi denotes the third obstacle measurement data, and ZLik denotes the obstacle preset value;

s36: arranging the barrier shadow values in descending order to obtain the barrier shadow sorting set.
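
Two parts of the analysis steps above can be illustrated without the formulas: the distance value D1 and the descending sort that produces each sorting set. The sketch below assumes planar (x, y) coordinates and a Euclidean distance, neither of which is stated in the text; the forward shift values and barrier shadow values are taken as already computed inputs. The names `distance_d1` and `sorting_set` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # planar coordinates are an assumption


def distance_d1(robot: Point, obstacle: Point) -> float:
    """Distance between the robot's real-time coordinate and an obstacle.

    Euclidean distance is assumed; the text only says a distance value
    is obtained from the two coordinate sets.
    """
    return math.hypot(robot[0] - obstacle[0], robot[1] - obstacle[1])


def sorting_set(values: List[float]) -> List[float]:
    """Arrange already computed values in descending order."""
    return sorted(values, reverse=True)


# Illustrative inputs only.
d1 = distance_d1((0.0, 0.0), (3.0, 4.0))
forward_movement_sorting_set = sorting_set([0.2, 0.9, 0.5])
```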

Preferably, the control module is used for controlling the operation of the robot, and the specific steps include:

s41: receiving a first statistical forward shift value, a first statistical barrier shadow value, a second statistical forward shift value and a second statistical barrier shadow value;

s42: acquiring the moving speed corresponding to the first statistical forward shift value and marking it as a first early warning speed, acquiring the real-time electric quantity corresponding to the first statistical forward shift value and marking it as a first early warning electric quantity, acquiring the contact data corresponding to the first statistical barrier shadow value and marking it as a first early warning size, and marking the distance value between the robot and the obstacle as a first early warning distance; controlling the moving speed and the moving direction of the robot when it meets an obstacle according to the first early warning distance, the first early warning size, the first early warning electric quantity and the first early warning speed;

s43: acquiring the moving speed corresponding to the second statistical forward shift value and marking it as a second early warning speed, acquiring the real-time electric quantity corresponding to the second statistical forward shift value and marking it as a second early warning electric quantity, acquiring the contact data corresponding to the second statistical barrier shadow value and marking it as a second early warning size, and marking the distance value between the robot and the obstacle as a second early warning distance; and controlling the moving direction of the robot when it meets an obstacle according to the second early warning distance, the second early warning size, the second early warning electric quantity and the second early warning speed.
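
The difference between the two regulation cases can be made concrete with a small sketch. It is an illustration under assumptions, not the control law of the invention: the text states only which quantities are consulted and whether speed, direction or both are controlled, so the `EarlyWarning` and `Command` records and the `regulate` function are hypothetical, and the command carries flags rather than actual speed or steering values.

```python
from dataclasses import dataclass


@dataclass
class EarlyWarning:
    distance: float  # early warning distance between robot and obstacle
    size: float      # early warning size taken from the contact data
    charge: float    # early warning electric quantity (real-time charge)
    speed: float     # early warning speed


@dataclass
class Command:
    adjust_speed: bool
    adjust_direction: bool
    warning: EarlyWarning  # quantities the adjustment is based on


def regulate(signal: int, warning: EarlyWarning) -> Command:
    """Map an early warning signal to the kind of adjustment described."""
    if signal == 3:
        # First statistical values: control moving speed and direction.
        return Command(adjust_speed=True, adjust_direction=True, warning=warning)
    if signal == 4:
        # Second statistical values: control moving direction only.
        return Command(adjust_speed=False, adjust_direction=True, warning=warning)
    # Signals 1 and 2 send no statistical values to this module.
    raise ValueError("only the third and fourth signals are regulated")
```

How the four early warning quantities translate into a concrete speed or heading is left open by the text, which is why the sketch stops at flags.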

The invention has the beneficial effects that:

according to the various aspects disclosed by the invention, the purposes of carrying out comprehensive analysis according to the moving state and the barrier state of the robot to carry out early warning for the operation of the robot and timely learning and adjusting can be achieved by using the data acquisition module, the positioning module, the data processing module, the data analysis module, the statistic early warning module and the regulation and control module in a matching way;

the data acquisition module acquires data information of the robot and surrounding environment information, wherein the data information comprises size data, movement data and electric quantity data of the robot, and the environment information comprises type data of the obstacles and contact data between the obstacles; the data information and the environment information are sent to the data processing module; by collecting, processing and analyzing the data information of the robot and the surrounding environment information, effective data support is provided for collision avoidance, early warning, learning and adjustment of the robot;

the positioning module acquires the coordinates of the robot's movement to obtain a first coordinate set, acquires the coordinates of the obstacles to obtain a second coordinate set, classifies and combines the first coordinate set and the second coordinate set to obtain a coordinate information set, and sends the coordinate information set to the data analysis module; by locating the movement of the robot and the position of the obstacles, data support is provided for the robot to change its running direction;

the data processing module receives and processes the data information and the environment information to obtain size processing data, movement processing data, electric quantity processing data, type processing data and contact processing data, and sends them to the data analysis module; by processing the data information and the environment information together, relationships between the data items are conveniently established, and data processing efficiency and accuracy are improved;

the data analysis module receives the size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and the coordinate information set, and performs analysis and calculation to obtain the forward movement sorting set and the barrier shadow sorting set; the forward shift value and the barrier shadow value are obtained by calculation that relates the processed data items to one another, so that the movement of the robot and its state relative to the obstacles can be conveniently analyzed;

the statistical early warning module receives the forward movement sorting set and the barrier shadow sorting set and performs statistics and early warning, and the regulation and control module regulates the operation of the robot; by performing statistics and early warning on the data after a collision, the purpose of intelligent learning is achieved.

Drawings

The invention will be further described with reference to the accompanying drawings.

Fig. 1 is a block diagram of an intelligent collision avoidance robot based on reinforcement learning.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, the invention relates to an intelligent robot for collision avoidance based on reinforcement learning, which comprises a data acquisition module, a positioning module, a data processing module, a data analysis module, a statistical early warning module and a regulation and control module;

the data processing module is used for receiving the data information and the environment information for processing to obtain size processing data, movement processing data, electric quantity processing data, type processing data and contact processing data, and sending the size processing data, the movement processing data, the electric quantity processing data, the type processing data and the contact processing data to the data analysis module; the data processing module is used for receiving the data information and the environment information for processing, and comprises the following specific steps:

receiving the data information and the environment information, and acquiring size data, movement data and electric quantity data of the robot in the data information;

setting the maximum width in the size data as a first measured value and marking it as YCi, i = 1, 2, 3, ..., n; setting the maximum thickness in the size data as a second measured value and marking it as ECi, i = 1, 2, 3, ..., n; setting the height in the size data as a third measured value and marking it as SCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third measured values and combining the values to obtain size processing data;

setting the maximum moving speed in the movement data as movement upper limit data and marking it as YSi, i = 1, 2, 3, ..., n; setting the maximum acceleration in the movement data as movement acceleration data and marking it as YJi, i = 1, 2, 3, ..., n; normalizing the marked movement upper limit data and movement acceleration data and combining the values to obtain movement processing data;

marking the real-time electric quantity in the electric quantity data as first electrical data CDYi, i = 1, 2, 3, ..., n; marking the standby power consumption data in the electric quantity data as second electrical data CDEi, i = 1, 2, 3, ..., n; marking the moving power consumption data in the electric quantity data as third electrical data CDSi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third electrical data and combining the values to obtain electric quantity processing data;

acquiring the type data of the obstacles and the contact data between the obstacles from the environment information;

setting different obstacle types to correspond to different obstacle preset values, matching the obstacle types in the type data against all obstacle types to obtain the corresponding obstacle preset values, and marking them as ZLik, i = 1, 2, 3, ..., n; k = 1, 2; normalizing the obstacle preset values and combining the values to obtain type processing data; wherein ZLik covers the obstacle preset value for a movable obstacle and the obstacle preset value for a non-movable obstacle;

setting the space height in the contact data between the obstacles as first obstacle measurement data and marking it as YZCi, i = 1, 2, 3, ..., n; setting the maximum width of the space in the contact data between the obstacles as second obstacle measurement data and marking it as EZCi, i = 1, 2, 3, ..., n; setting the minimum width of the space in the contact data between the obstacles as third obstacle measurement data and marking it as SZCi, i = 1, 2, 3, ..., n; normalizing the marked first, second and third obstacle measurement data and combining the values to obtain contact processing data;

the data analysis module is used for receiving size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set, analyzing and calculating to obtain a forward movement sorting set and a barrier shadow sorting set; the specific steps of the data analysis module for performing analysis operation comprise:

acquiring size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and a coordinate information set which are subjected to normalization processing;

acquiring a forward shift value of the robot's movement by using a formula, wherein the formula is as follows:

(formula not reproduced in this text)

arranging the forward shift values in descending order to obtain the forward movement sorting set;

obtaining the distance value between the real-time coordinates of the robot's movement in the first coordinate set and the coordinates of the obstacle in the second coordinate set of the coordinate information set, and marking the distance value as D1;

obtaining the barrier shadow value of the obstacle by using a formula, wherein the formula is as follows:

(formula not reproduced in this text)

arranging the barrier shadow values in descending order to obtain the barrier shadow sorting set;

the statistical early warning module is used for receiving the forward movement sorting set and the barrier shadow sorting set and carrying out statistics and early warning operations, wherein step one comprises: receiving the forward movement sorting set and the barrier shadow sorting set, marking a preset standard forward shift threshold as P1 and a preset standard barrier shadow threshold as P2, and comparing P1 with the forward shift value Q_qy in the forward movement sorting set and P2 with the barrier shadow value Q_zy in the barrier shadow sorting set;

the regulation and control module is used for regulating and controlling the operation of the robot, and the specific steps comprise:

receiving a first statistical forward shift value, a first statistical barrier shadow value, a second statistical forward shift value and a second statistical barrier shadow value;

acquiring the moving speed corresponding to the first statistical forward shift value and marking it as a first early warning speed, acquiring the real-time electric quantity corresponding to the first statistical forward shift value and marking it as a first early warning electric quantity, acquiring the contact data corresponding to the first statistical barrier shadow value and marking it as a first early warning size, and marking the distance value between the robot and the obstacle as a first early warning distance; controlling the moving speed and the moving direction of the robot when it meets an obstacle according to the first early warning distance, the first early warning size, the first early warning electric quantity and the first early warning speed;

acquiring the moving speed corresponding to the second statistical forward shift value and marking it as a second early warning speed, acquiring the real-time electric quantity corresponding to the second statistical forward shift value and marking it as a second early warning electric quantity, acquiring the contact data corresponding to the second statistical barrier shadow value and marking it as a second early warning size, and marking the distance value between the robot and the obstacle as a second early warning distance; controlling the moving direction of the robot when it meets an obstacle according to the second early warning distance, the second early warning size, the second early warning electric quantity and the second early warning speed;

the above formulas are obtained by collecting a large amount of data and performing software simulation, and the coefficients in the formulas are set by those skilled in the art according to actual conditions.

The working principle of the invention is as follows: in the embodiment of the invention, the data acquisition module, the positioning module, the data processing module, the data analysis module, the statistic and early warning module and the regulation and control module are used in a matched manner, so that the aims of carrying out comprehensive analysis on the moving state and the barrier state of the robot to carry out early warning on the operation of the robot and timely learning and adjusting can be achieved;

the data analysis module receives the size processing data, movement processing data, electric quantity processing data, type processing data, contact processing data and the coordinate information set, and performs analysis and calculation: the forward shift value of the robot's movement is obtained using the forward shift formula, and the forward shift values are arranged in descending order to obtain the forward movement sorting set; the barrier shadow value of the obstacle is obtained using the barrier shadow formula, and the barrier shadow values are arranged in descending order to obtain the barrier shadow sorting set; the forward shift value and the barrier shadow value are obtained by calculation that relates the processed data items to one another, so that the movement of the robot and its state relative to the obstacles can be conveniently analyzed.

In the embodiments provided by the present invention, it should be understood that the disclosed system and method can be implemented in other ways. For example, the above-described embodiments are merely illustrative; the division of the modules is only one logical functional division, and other divisions are possible in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the method of the embodiment.

In addition, the functional modules in each embodiment of the present invention may be integrated into one control module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module can be realized in the form of hardware, or in the form of hardware combined with software functional modules.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Furthermore, it is to be understood that the word "comprising" does not exclude other modules or steps, and the singular does not exclude the plural. A plurality of modules or means recited in the system claims may also be implemented by one module or means in software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above examples are only intended to illustrate the technical process of the present invention and not to limit the same, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical process of the present invention without departing from the spirit and scope of the technical process of the present invention.

Claims

1. A collision avoidance intelligent robot based on reinforcement learning, characterized by comprising a data acquisition module, a positioning module, a data processing module, a data analysis module, a statistical early warning module and a control module;

the data acquisition module is used for collecting data information of the robot and information on the surrounding environment, wherein the data information comprises size data, movement data and power data of the robot, and the environment information comprises type data of the obstacles and connection data between the obstacles; the data information and the environment information are sent to the data processing module;

the positioning module is used for acquiring the coordinates of the robot's movement to obtain a first coordinate set, acquiring the coordinates of the obstacles to obtain a second coordinate set, classifying and combining the first coordinate set and the second coordinate set to obtain a coordinate information set, and sending the coordinate information set to the data analysis module;

the data processing module is used for receiving and processing the data information and the environment information to obtain size processing data, movement processing data, power processing data, type processing data and connection processing data, and sending them together to the data analysis module;

the data analysis module is used for receiving the size processing data, the movement processing data, the power processing data, the type processing data, the connection processing data and the coordinate information set, and performing analysis and calculation to obtain a forward movement ordering set and a barrier shadow ordering set;

the statistical early warning module is used for receiving the forward movement ordering set and the barrier shadow ordering set and performing statistics and early warning operations, the specific steps comprising:

Step 1: receiving the forward movement ordering set and the barrier shadow ordering set, marking a preset standard forward movement threshold as P1 and a preset standard barrier shadow threshold as P2, and comparing them respectively with the forward movement value Q_qy in the forward movement ordering set and the barrier shadow value Q_zy in the barrier shadow ordering set;

Step 2: if Q_qy ≥ P1 and Q_zy ≥ P2, determining that the robot can move efficiently and can avoid obstacles normally, and generating a first early warning signal; if Q_qy < P1 and Q_zy ≥ P2, determining that the robot moves inefficiently and can avoid obstacles normally, and generating a second early warning signal; if Q_qy ≥ P1 and Q_zy < P2, determining that the robot can move efficiently but cannot avoid obstacles, generating a third early warning signal, and marking the forward movement value and the barrier shadow value corresponding to the third early warning signal as a first statistical forward movement value and a first statistical barrier shadow value respectively; if Q_qy < P1 and Q_zy < P2, determining that the robot moves inefficiently and cannot avoid obstacles, generating a fourth early warning signal, and marking the forward movement value and the barrier shadow value corresponding to the fourth early warning signal as a second statistical forward movement value and a second statistical barrier shadow value respectively;

Step 3: sending the first statistical forward movement value and the first statistical barrier shadow value, together with the second statistical forward movement value and the second statistical barrier shadow value, to the control module;

the control module is used for regulating the operation of the robot.
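The four-way comparison in Steps 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `classify` and `collect`, the `WarningStats` container, and the assumption that each Q_qy is already paired with one Q_zy are all hypothetical, since the claim does not state how values from the two ordering sets are paired.

```python
from dataclasses import dataclass, field


@dataclass
class WarningStats:
    # (Q_qy, Q_zy) pairs recorded for third warning signals (hypothetical container)
    first_stats: list = field(default_factory=list)
    # (Q_qy, Q_zy) pairs recorded for fourth warning signals
    second_stats: list = field(default_factory=list)


def classify(q_qy, q_zy, p1, p2):
    """Return the early warning signal number (1 to 4) for one value pair."""
    efficient = q_qy >= p1   # compared with the standard threshold P1
    avoids = q_zy >= p2      # compared with the standard threshold P2
    if efficient and avoids:
        return 1
    if not efficient and avoids:
        return 2
    if efficient and not avoids:
        return 3
    return 4


def collect(pairs, p1, p2):
    """Keep only the pairs that the claim forwards to the control module."""
    stats = WarningStats()
    for q_qy, q_zy in pairs:  # pairing of values is an assumption of this sketch
        signal = classify(q_qy, q_zy, p1, p2)
        if signal == 3:
            stats.first_stats.append((q_qy, q_zy))
        elif signal == 4:
            stats.second_stats.append((q_qy, q_zy))
    return stats
```

Only the third and fourth signals produce statistics, which matches Step 3, where just those two groups of values are sent to the control module.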

2. The collision avoidance intelligent robot based on reinforcement learning according to claim 1, characterized in that the specific steps by which the data processing module receives and processes the data information and the environment information comprise:

S21: receiving the data information and the environment information, and acquiring the size data, movement data and power data of the robot from the data information;

S22: setting the maximum width in the size data as a first measurement value and marking it as YCi, i = 1, 2, 3...n; setting the maximum thickness in the size data as a second measurement value and marking it as ECi, i = 1, 2, 3...n; setting the height in the size data as a third measurement value and marking it as SCi, i = 1, 2, 3...n; normalizing the marked first, second and third measurement values and combining the resulting values to obtain the size processing data;

S23: setting the maximum movement rate in the movement data as movement upper-limit data and marking it as YSi, i = 1, 2, 3...n; setting the maximum acceleration in the movement data as movement acceleration data and marking it as YJi, i = 1, 2, 3...n; normalizing the marked movement upper-limit data and movement acceleration data and combining the resulting values to obtain the movement processing data;

S24: setting the real-time power level in the power data as first power measurement data and marking it as CDYi, i = 1, 2, 3...n; setting the standby power consumption data in the power data as second power measurement data and marking it as CDEi, i = 1, 2, 3...n; setting the movement power consumption data in the power data as third power measurement data and marking it as CDSi, i = 1, 2, 3...n; normalizing the marked first, second and third power measurement data and combining the resulting values to obtain the power processing data;

S25: acquiring the type data of the obstacles and the connection data between the obstacles from the environment information;

S26: setting different obstacle types to correspond to different obstacle preset values, matching the obstacle type in the type data of the obstacles against all obstacle types to obtain the corresponding obstacle preset value, and marking it as ZLYik, i = 1, 2, 3...n; k = 1, 2; normalizing the several obstacle preset values and combining the resulting values to obtain the type processing data; wherein ZLYik comprises the obstacle preset value of movable obstacles and the obstacle preset value of immovable obstacles;

S27: setting the space height in the connection data between the obstacles as first obstacle measurement data and marking it as YZCi, i = 1, 2, 3...n; setting the maximum space width in the connection data between the obstacles as second obstacle measurement data and marking it as EZCi, i = 1, 2, 3...n; setting the minimum space width in the connection data between the obstacles as third obstacle measurement data and marking it as SZCi, i = 1, 2, 3...n; normalizing the marked first, second and third obstacle measurement data and combining the resulting values to obtain the connection processing data.
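Steps S22 to S27 all follow the same pattern: mark a few measured quantities, normalize them, and combine the normalized values. The Python sketch below illustrates that pattern for the size data of S22 only. The claim does not name a normalization method, so min-max scaling is an assumption here, as are the function names and the choice to combine values as per-index tuples.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; min-max scaling is an assumed method."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All samples equal: map everything to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def size_processing_data(widths, thicknesses, heights):
    """Combine normalized YCi, ECi and SCi into one tuple per index i."""
    yc = min_max_normalize(widths)        # first measurement values YCi
    ec = min_max_normalize(thicknesses)   # second measurement values ECi
    sc = min_max_normalize(heights)       # third measurement values SCi
    return list(zip(yc, ec, sc))
```

The same helper could be reused for the movement, power, type and connection data, since each of those steps normalizes and combines its marked values in the same way.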

3. The collision avoidance intelligent robot based on reinforcement learning according to claim 1, characterized in that the specific steps of the analysis operation performed by the data analysis module comprise:

S31: acquiring the normalized size processing data, movement processing data, power processing data, type processing data and connection processing data, together with the coordinate information set;

S32: obtaining the forward movement value of the robot's movement using a formula, the formula being:

wherein Q_qy denotes the forward movement value, μ denotes a preset forward movement correction factor, a1 and a2 denote different proportionality coefficients, YSi denotes the movement upper-limit data, CDYi denotes the first power measurement data, CDEi denotes the second power measurement data, CDSi denotes the third power measurement data, t1 denotes the duration of the robot's standby power consumption, t2 denotes the duration of the robot's movement power consumption, and t3 denotes the duration of the robot's movement acceleration;

S33: arranging the forward movement values in descending order to obtain the forward movement ordering set;

S34: obtaining the distance value between the real-time coordinates of the robot's movement in the first coordinate set and the coordinates of the obstacle in the second coordinate set of the coordinate information set, and marking it as D1;

S35: obtaining the barrier shadow value of the obstacle using a formula, the formula being:

wherein Q_zy denotes the barrier shadow value, α denotes a preset barrier shadow correction factor, b1, b2, b3 and b4 denote different proportionality coefficients, YCi denotes the first measurement value, ECi denotes the second measurement value, SCi denotes the third measurement value, YZCi denotes the first obstacle measurement data, EZCi denotes the second obstacle measurement data, SZCi denotes the third obstacle measurement data, and ZLYik denotes the obstacle preset value;

S36: arranging the barrier shadow values in descending order to obtain the barrier shadow ordering set.
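The distance step S34 and the ordering steps S33 and S36 can be sketched in Python as follows. The formulas for Q_qy and Q_zy are not reproduced in this text, so the sketch takes already computed values as input. Treating D1 as a Euclidean distance is an assumption, and the function names are hypothetical.

```python
import math


def distance_d1(robot_xy, obstacle_xy):
    """Distance D1 between robot and obstacle; Euclidean metric is assumed."""
    return math.dist(robot_xy, obstacle_xy)


def descending_ordering_set(values):
    """Arrange values in descending order, as in S33 and S36."""
    return sorted(values, reverse=True)
```

For example, `descending_ordering_set([0.2, 0.9, 0.5])` places the largest value first, so the statistical early warning module would meet the largest Q_qy or Q_zy at the head of its ordering set.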

4. The collision avoidance intelligent robot based on reinforcement learning according to claim 1, characterized in that the control module is used for regulating the operation of the robot, the specific steps comprising:

S41: receiving the first statistical forward movement value and the first statistical barrier shadow value, together with the second statistical forward movement value and the second statistical barrier shadow value;

S42: acquiring the moving speed corresponding to the first statistical forward movement value and marking it as a first early warning speed; acquiring the real-time power level corresponding to the first statistical forward movement value and marking it as a first early warning power level; acquiring the connection data corresponding to the first statistical barrier shadow value and marking it as a first early warning size; marking the distance value between the robot and the obstacle as a first early warning distance; and controlling the moving speed and moving direction of the robot when it encounters an obstacle according to the first early warning distance, the first early warning size, the first early warning power level and the first early warning speed;

S43: acquiring the moving speed corresponding to the second statistical forward movement value and marking it as a second early warning speed; acquiring the real-time power level corresponding to the second statistical forward movement value and marking it as a second early warning power level; acquiring the connection data corresponding to the second statistical barrier shadow value and marking it as a second early warning size; marking the distance value between the robot and the obstacle as a second early warning distance; and controlling the moving direction of the robot when it encounters an obstacle according to the second early warning distance, the second early warning size, the second early warning power level and the second early warning speed.