CN119756364B - A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features - Google Patents

A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features

Info

Publication number
CN119756364B
Authority
CN
China
Prior art keywords
unmanned
pursuit
obstacles
ship
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411799637.1A
Other languages
Chinese (zh)
Other versions
CN119756364A (en)
Inventor
陈熙源
姚志婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202411799637.1A
Publication of CN119756364A
Application granted
Publication of CN119756364B
Active
Anticipated expiration

Abstract

The invention discloses a local anti-pursuit method for unmanned surface vehicles (USVs) that aggregates spatiotemporal features. Current local navigation algorithms for USVs mainly plan local collision-avoidance paths against static obstacles and dynamic obstacles in uniform linear motion. Because of loss of control or malicious blocking, however, a USV may be pursued, which poses a considerable threat to its navigation safety. The invention proposes an anti-pursuit path planning method for complex water-surface environments, comprising: simulating multi-source shipborne sensors to generate simulated signals that provide environment perception data; constructing an encoding structure that aggregates spatiotemporal features to distinguish dynamic from static obstacles in the scene; and building an end-to-end navigation path planning model with a value-distributional reinforcement learning algorithm. The invention can identify the motion characteristics of obstacles in the environment, thereby improving the decision-making ability and navigation robustness of the USV in complex environments.

Description

A local anti-pursuit method for unmanned surface vehicles based on aggregation of spatiotemporal features
Technical Field
The invention relates to the field of intelligent unmanned vessels, and in particular to an end-to-end local navigation method for an unmanned surface vehicle (USV) that aggregates spatiotemporal features. A spatiotemporal feature encoding structure extracts environment observation information, which serves as the input to a deep reinforcement learning algorithm and guides the USV to assign weights to that information autonomously in order to achieve local obstacle avoidance.
Background
In recent years the unmanned surface vehicle (USV) has been a research focus in intelligent shipping. Improving a vessel's autonomous navigation capability depends on highly reliable real-time path planning and obstacle avoidance. A large body of research on vessel path planning has emerged over the last five years, focusing on designing different navigation scenarios to verify the path planning capability of decision-making algorithms. This work falls into two categories. (1) Global path planning: researchers obtain static environment features from sources such as electronic charts and plan a route from a starting point to an end point. The algorithm only needs to adjust the track when the vessel deviates substantially from the preset path, and the planned path need not be updated in real time. (2) Local path planning: obstacle information acquired in real time guides the change in the vessel's speed and heading angle at the next moment, so the algorithm must perceive the environment in real time and make reasonable motion decisions.
The invention focuses on local path planning and addresses local collision avoidance for a USV in complex environments with highly dynamic pursuers, dense obstacles, and vortex interference. It studies an end-to-end robust vessel motion decision technique that achieves real-time, highly reliable autonomous collision avoidance and navigation planning in scenes without prior information.
Disclosure of Invention
The invention aims to provide a robust navigation planning method for a USV that aggregates spatiotemporal features. It targets the local path planning setting described in the background (highly dynamic pursuers, dense obstacles, and vortex interference on the water surface) so that the vessel can avoid obstacles with various motion patterns in real time.
The invention adopts a local anti-pursuit method for a USV that aggregates spatiotemporal features, comprising the following steps:
Step 1, defining the range of a local navigation area, setting the start and end coordinates of the USV within the area, and randomly generating vortex signals within the area;
Step 2, constructing the navigation environment by randomly generating static obstacles of different sizes and dynamic obstacles at random positions in the blank area;
Step 3, generating simulated signals that provide speed, own-ship position, and relative obstacle distance information, based on the sensor types carried by a real vessel, and constructing observation signals from them;
Step 4, feeding the observation signals into a spatiotemporal feature aggregation module, which aggregates spatial features by stacking frames along the time dimension;
Step 5, selecting a distributional reinforcement learning algorithm as the backbone framework of the navigation planning network, and defining the action space and the reward function;
Step 6, collecting experience tuples during training in an experience pool, optimizing the network weight parameters on mini-batches of experience tuples, and training the deep reinforcement learning model with an experience replay strategy until convergence;
Step 7, constructing an experimental verification scene, loading the converged model parameters, and verifying the vessel's anti-pursuit collision-avoidance capability in a local environment and its ability to generalize to new scenes.
The local navigation area is a simulation environment built for verifying the effectiveness of the algorithm.
The space-time feature aggregation module is formed by connecting an independent space feature aggregation module and a time feature extraction module in series.
Step 1 specifically comprises the following steps:
1-1, defining the range of the local navigation area of the USV, with length H and width W;
1-2, initializing the start coordinates $P_{STA}(x, y)$ and end coordinates $P_{END}(x, y)$ of the USV; within the current local navigation area, the coordinates of the USV at each moment are $P(x, y)$.
Step 2 specifically comprises the following steps:
2-1, selecting a random operator and, for each static obstacle, randomly initializing its position coordinates within the blank area and drawing its radius from a uniform distribution, where $s_i$ denotes the i-th static obstacle;
2-2, for each dynamic obstacle, randomly generating its initial position coordinates in the remaining blank area and setting its end point to the position $P(x, y)$ of the USV at the current moment, so that the end point changes continuously as the USV moves, where $d_j$ denotes the j-th dynamic obstacle.
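Step 2-1 can be illustrated with a short sketch. It assumes that "blank area" means a candidate obstacle must not overlap an existing obstacle or cover the start and end points, and it uses rejection sampling to enforce that; the names `StaticObstacle`, `sample_static_obstacles`, and `keep_clear` are hypothetical and do not appear in the patent.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class StaticObstacle:
    x: float
    y: float
    radius: float


def sample_static_obstacles(width, height, count, r_min, r_max,
                            keep_clear, rng, max_tries=1000):
    """Rejection-sample non-overlapping circular obstacles inside the area.

    keep_clear is a list of (x, y) points, such as the start and end
    coordinates, that no obstacle may cover (an assumption of this sketch).
    """
    obstacles = []
    tries = 0
    while len(obstacles) < count and tries < max_tries:
        tries += 1
        r = rng.uniform(r_min, r_max)      # radius drawn uniformly
        x = rng.uniform(r, width - r)      # keep the whole circle inside the area
        y = rng.uniform(r, height - r)
        # Reject candidates that cover a protected point.
        if any(math.hypot(x - px, y - py) <= r for px, py in keep_clear):
            continue
        # Reject candidates that overlap an already placed obstacle.
        if any(math.hypot(x - o.x, y - o.y) <= r + o.radius for o in obstacles):
            continue
        obstacles.append(StaticObstacle(x, y, r))
    return obstacles
```

Passing a seeded `random.Random` instance as `rng` makes a generated scene reproducible, which is convenient when the same layout must be replayed for several trained models.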
Step 3 comprises the following steps:
3-1, simulating the working principle and output characteristics of each sensor, based on the types of shipborne sensors, to generate the observation signal at the current moment;
3-2, generating satellite and inertial device signals that provide the vessel's own position $P(x, y)$;
3-3, simulating a Doppler velocimeter signal that provides the vessel's current speed over ground $V(v_x, v_y)$;
3-4, simulating a lidar signal that provides the relative distances $D(d_1, d_2, \dots, d_N)$ from the vessel to obstacles in the surrounding environment, where N is the number of laser beams generated when simulating the lidar signal;
3-5, concatenating the vessel's position, speed, and relative obstacle distance information into the observation signal for the current environment, $s_t = \{P(x, y), V(v_x, v_y), D(d_1, d_2, \dots, d_N)\}$.
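Steps 3-4 and 3-5 can be sketched as follows. The sketch assumes circular obstacles given as `(x, y, radius)` tuples, beams spread evenly over 360 degrees, and an ideal noise-free lidar with a maximum range; none of these details are stated in the patent, and all function names are hypothetical.

```python
import math


def ray_circle_distance(origin, angle, center, radius, max_range):
    """Distance along a ray to a circle, or max_range if the ray misses it."""
    dx, dy = math.cos(angle), math.sin(angle)
    fx, fy = origin[0] - center[0], origin[1] - center[1]
    c = fx * fx + fy * fy - radius * radius
    if c <= 0.0:
        return 0.0                     # origin is already inside the obstacle
    # Solve |origin + t * d - center|^2 = radius^2 for the nearest t.
    b = fx * dx + fy * dy
    disc = b * b - c
    if disc < 0.0:
        return max_range               # the ray's line never touches the circle
    t = -b - math.sqrt(disc)
    if t < 0.0:
        return max_range               # the circle lies behind the sensor
    return min(t, max_range)


def simulate_lidar(origin, heading, obstacles, n_beams, max_range):
    """Return N ranges d_1..d_N, one per evenly spaced beam."""
    ranges = []
    for i in range(n_beams):
        angle = heading + 2.0 * math.pi * i / n_beams
        hits = (ray_circle_distance(origin, angle, (x, y), r, max_range)
                for x, y, r in obstacles)
        ranges.append(min(hits, default=max_range))
    return ranges


def build_observation(position, velocity, ranges):
    """Concatenate P(x, y), V(v_x, v_y) and D(d_1..d_N) into one flat vector s_t."""
    return [position[0], position[1], velocity[0], velocity[1], *ranges]
```

The observation is a flat vector of length 4 + N, so the beam count N fixes the input width of whatever network consumes it.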
Step 4 comprises the following steps:
4-1, designing a stack structure for the observation signal $s_t$ concatenated in step 3 and stacking the signals along the time dimension to construct spatial feature information rich in changes of obstacle position, $S_t = \{s_{t-k}, \dots, s_{t-1}, s_t\}$, where $s_t = \{P(x, y), V(v_x, v_y), D(d_1, d_2, \dots, d_N)\}$; motion features are extracted from this spatial feature information with a rectangular convolutional neural network;
4-2, after a neural network has aggregated the spatial feature information, a continuous-time neural network further extracts the motion dependencies within it to obtain discriminative features of the environment observation signal in the time dimension; the continuous-time neural network is composed of the sub-networks {f, g, h}, and the three sub-networks share a common backbone network structure.
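The stack structure of step 4-1 can be sketched with a fixed-length queue. The class name `FrameStack` is hypothetical, and padding the history with copies of the first observation at the start of an episode is an assumption of this sketch; the patent does not say how the first few frames are handled.

```python
from collections import deque


class FrameStack:
    """Keep the most recent observation frames, oldest first."""

    def __init__(self, n_frames):
        self.n_frames = n_frames
        self.frames = deque(maxlen=n_frames)

    def reset(self, first_obs):
        # Assumption: pad the history with copies of the first frame.
        self.frames.clear()
        for _ in range(self.n_frames):
            self.frames.append(list(first_obs))
        return self.state()

    def push(self, obs):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append(list(obs))
        return self.state()

    def state(self):
        """Return S_t as a list of frames with shape (n_frames, obs_dim)."""
        return [list(frame) for frame in self.frames]
```

Because each row is one time step, a convolution over the resulting 2D array can compare the same lidar beam across consecutive frames, which is what lets the network distinguish moving obstacles from static ones.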
Step 5 comprises the following steps:
5-1, selecting the value-distributional reinforcement learning algorithm IQN as the network framework; by modeling the distribution of the cumulative return and using that distribution to guide decisions, the method avoids relying on the mean return in every state and thereby ignoring changes of risk in the scene;
5-2, defining the action space as changes of acceleration and angular velocity in a discrete space, $a_t = \{\alpha, \omega\}$, where $\alpha$ is the acceleration and $\omega$ is the angular velocity;
5-3, determining the reward function $r_t$: defining a reward factor for reaching the end point, a penalty factor for colliding with an obstacle, a time penalty factor, and a reward function directed toward the end point g at the current time t.
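The discrete action space of step 5-2 can be enumerated as below. The value sets are the ones listed in Example 1; pairing every acceleration with every angular-rate value (nine actions in total) and the names `ACTIONS` and `decode_action` are assumptions of this sketch.

```python
from itertools import product

# Value sets taken from Example 1 of the description.
ACCELERATIONS = (-0.2, 0.0, 0.2)   # acceleration changes, m/s^2
ANGULAR_RATES = (-5.0, 0.0, 5.0)   # angular velocity changes, per second

# Assumption: the joint action space is the Cartesian product of both sets.
ACTIONS = list(product(ACCELERATIONS, ANGULAR_RATES))


def decode_action(index):
    """Map a network output index to the pair a_t = (alpha, omega)."""
    return ACTIONS[index]
```

Under this assumption the planning network needs one output head per joint action, so its final layer would score nine actions for each sampled quantile.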
Step 6 comprises the following steps:
6-1, initializing the deep reinforcement learning network and the input state $s_t$, selecting an action $a_t$ in the current input state, executing the action to obtain the reward $r_t$ at the current moment and the state $s_{t+1}$ at the next moment, and storing the experience tuple $(s_t, a_t, r_t, s_{t+1})$ in the experience replay pool;
6-2, determining the capacity of the experience replay pool; once the accumulated experience tuples exceed this capacity, experience tuples are sampled in mini-batches, gradients are computed by gradient descent during backpropagation, and the network parameters are updated until the model converges.
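The experience replay pool of steps 6-1 and 6-2 can be sketched as a bounded buffer with uniform mini-batch sampling. The class name `ReplayBuffer`, the `is_full` helper, and uniform sampling without replacement are assumptions; the patent only specifies the stored tuple and that sampling starts once the pool has filled.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (s_t, a_t, r_t, s_{t+1}) experience tuples."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.buffer = deque(maxlen=capacity)   # oldest tuples are evicted first
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def is_full(self):
        # Step 6-2 starts sampling only after the pool has filled up.
        return len(self.buffer) >= self.capacity

    def sample(self, batch_size):
        """Draw a mini-batch uniformly at random, without replacement."""
        return self.rng.sample(list(self.buffer), batch_size)
```

Sampling at random from a large pool breaks the correlation between consecutive time steps, which is the usual reason for training on replayed experience rather than on the most recent transitions.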
Step 7 comprises the following step:
7-1, the properties of the vortices, static obstacles, and dynamic obstacles in the experimental scene are the same as in the training scene, but their positions and numbers are randomly initialized according to the experimental requirements.
Compared with the prior art, the invention has the following advantages:
1. The invention provides a local anti-pursuit method for a USV that aggregates spatiotemporal features and builds an end-to-end local path planning network, from the environment data collected by the sensors to the navigation planning decision signal. The network structure fully extracts the changes in spatial position contained in consecutive observation signals, and a continuous-time network then extracts the long-term motion dependencies of obstacles contained in those features. This enables local anti-pursuit collision avoidance and navigation route planning in high-risk, obstacle-dense scenes.
2. The invention takes full account of the water-surface environment and the characteristics of shipborne sensors, and adds vortex interference to simulate the influence of a dynamic water surface on vessel speed. Environment observation data are generated by simulating the output characteristics of the shipborne sensors, so that the simulation input is consistent with the front-end perception input of a vessel in a real environment. Compared with prior algorithms that estimate position from obstacle position and length information, sensing sources such as lidar update at a higher frequency and propagate less error when obstacle positions are estimated, which helps generate decision signals that are more robust and more timely.
Drawings
FIG. 1 is a flow chart of the local anti-pursuit method for a USV that aggregates spatiotemporal features.
FIG. 2 is a schematic diagram of the environment observation state vector constructed from the simulated shipborne sensors.
FIG. 3 is a diagram of the value-distributional reinforcement learning network architecture that aggregates spatiotemporal features.
FIG. 4 shows the local anti-pursuit and navigation path planning results of the invention.
Detailed Description
The technical scheme of the invention is described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, the invention discloses a local anti-pursuit method for a USV that aggregates spatiotemporal features. During local path planning, the method simulates the working principles and output states of the sensors to generate environment observation signals from multiple information sources, and uses a stack structure to stack the observations along the time dimension to obtain state information. The network architecture shown in FIG. 3 is then constructed; it comprises spatial and temporal feature extraction modules and is combined with a value-distributional reinforcement learning algorithm to achieve anti-pursuit guidance and navigation path planning for the USV in high-risk environments. The method is mainly intended for the navigation path planning module of a USV and provides a real-time, high-safety path planning algorithm for highly dynamic water-surface environments with strong interference. It specifically comprises the following steps:
Step 1, defining the range of a local navigation area, setting the start and end coordinates of the USV within the area, and randomly generating vortex signals within the area.
Step 2, constructing the navigation environment by randomly generating static obstacles of different sizes and dynamic obstacles at random positions in the blank area.
Step 3, generating simulated signals that provide speed, own-ship position, and relative obstacle distance information, based on the sensor types carried by a real vessel, and constructing observation signals from them.
Step 4, aggregating spatial features of the observation signals by stacking frames along the time dimension, and feeding the observation signals frame by frame into a continuous-time neural network to aggregate temporal features.
Step 5, selecting a distributional reinforcement learning (IQN) algorithm as the backbone framework of the navigation planning network, and defining the action space and the reward function.
Step 6, collecting experience tuples during training in an experience pool, optimizing the network weight parameters on mini-batches of experience tuples, and training the deep reinforcement learning model with an experience replay strategy until convergence.
Step 7, constructing an experimental verification scene, loading the converged model parameters, and verifying the vessel's anti-pursuit collision-avoidance capability in a local environment and its ability to generalize to new scenes.
Further, step 1 specifically comprises the following steps:
1-1, defining the range of the local navigation area of the USV, with length H and width W.
1-2, initializing the start coordinates $P_{STA}(x, y)$ and end coordinates $P_{END}(x, y)$ of the USV. Within the current local navigation area, the coordinates of the USV at each moment are $P(x, y)$.
Further, step 2 specifically comprises the following steps:
2-1, selecting a random operator and, for each static obstacle, randomly initializing its position coordinates within the blank area and drawing its radius from a uniform distribution.
2-2, for each dynamic obstacle, randomly generating its initial position coordinates in the remaining blank area and setting its end point to the position $P(x, y)$ of the USV at the current moment, so that the end point changes continuously as the USV moves.
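The moving end point in step 2-2 means each dynamic obstacle keeps re-targeting the USV. One simple way to realize this is a pure-pursuit update in which the pursuer heads straight for the USV's current position at a constant speed. The patent does not specify the pursuer's motion model, so the straight-line rule, the constant `speed`, and the name `pursuer_step` are all assumptions of this sketch.

```python
import math


def pursuer_step(pursuer_xy, usv_xy, speed, dt):
    """Advance the pursuer by one time step toward the USV's current position."""
    dx = usv_xy[0] - pursuer_xy[0]
    dy = usv_xy[1] - pursuer_xy[1]
    dist = math.hypot(dx, dy)
    step = speed * dt
    if dist <= step:
        # The pursuer would reach or overshoot the target: stop at the target.
        return (usv_xy[0], usv_xy[1])
    # Otherwise move a distance of `step` along the unit vector to the target.
    return (pursuer_xy[0] + step * dx / dist,
            pursuer_xy[1] + step * dy / dist)
```

Calling this once per simulation step with the USV's latest position reproduces the "end point changes continuously" behaviour: the target is re-read every step instead of being fixed at episode start.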
Further, step 3 comprises the following steps:
3-1, simulating the working principle and output characteristics of each sensor, based on the types of shipborne sensors, to generate the observation signal at the current moment.
3-2, generating satellite and inertial device signals that provide the vessel's own position $P(x, y)$.
3-3, simulating a Doppler velocimeter signal that provides the vessel's current speed over ground $V(v_x, v_y)$.
3-4, simulating a lidar signal that provides the relative distances $D(d_1, d_2, \dots, d_N)$ from the vessel to obstacles in the surrounding environment, where N is the number of laser beams generated when simulating the lidar signal.
3-5, concatenating the vessel's position, speed, and relative obstacle distance information into the observation signal for the current environment, $s_t = \{P(x, y), V(v_x, v_y), D(d_1, d_2, \dots, d_N)\}$.
Further, step 4 comprises the following steps:
4-1, designing a stack structure for the observation signal $s_t$ concatenated in step 3 and stacking the signals along the time dimension to construct spatial feature information rich in changes of obstacle position, $S_t = \{s_{t-k}, \dots, s_{t-1}, s_t\}$, where $s_t = \{P(x, y), V(v_x, v_y), D(d_1, d_2, \dots, d_N)\}$. Motion features are extracted from this spatial feature information with a rectangular convolutional neural network.
4-2, after a neural network has aggregated the spatial feature information, a continuous-time neural network further extracts the motion dependencies within it to obtain discriminative features of the environment observation signal in the time dimension. Specifically, the continuous-time neural network is composed of the sub-networks {f, g, h}, and the three sub-networks share a common backbone network structure.
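A continuous-time cell with three heads {f, g, h} on a shared backbone resembles the closed-form continuous-time (CfC) formulation, in which the new hidden state is a time-gated blend $\sigma(-f \cdot \Delta t) \odot g + (1 - \sigma(-f \cdot \Delta t)) \odot h$. Whether the patent uses exactly this gating is an assumption. The sketch below uses random untrained weights, plain Python lists, and hypothetical names (`GatedContinuousTimeCell`, `init_layer`, `linear`) purely to show the data flow.

```python
import math
import random


def linear(weights, bias, vec):
    """Dense layer: one dot product per output unit, plus bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class GatedContinuousTimeCell:
    """Three heads f, g, h on a shared backbone (CfC-style; an assumption)."""

    def __init__(self, input_size, hidden_size, backbone_size, seed=0):
        rng = random.Random(seed)
        self.backbone = init_layer(backbone_size, input_size + hidden_size, rng)
        self.f = init_layer(hidden_size, backbone_size, rng)
        self.g = init_layer(hidden_size, backbone_size, rng)
        self.h = init_layer(hidden_size, backbone_size, rng)

    def step(self, x, hidden, dt):
        # Shared backbone over the concatenated input frame and hidden state.
        shared = [math.tanh(v) for v in linear(*self.backbone, list(x) + list(hidden))]
        f = linear(*self.f, shared)
        g = [math.tanh(v) for v in linear(*self.g, shared)]
        h = [math.tanh(v) for v in linear(*self.h, shared)]
        # Time gate sigmoid(-f * dt), written as 1 / (1 + exp(f * dt)).
        gate = [1.0 / (1.0 + math.exp(fi * dt)) for fi in f]
        return [s * gi + (1.0 - s) * hi for s, gi, hi in zip(gate, g, h)]
```

Feeding the stacked frames one at a time, with `dt` set to the sampling interval between frames, yields a final hidden state that summarizes how the observations evolved over the stack.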
Further, step 5 comprises the following steps:
5-1, selecting the value-distributional reinforcement learning algorithm IQN as the network framework. By modeling the distribution of the cumulative return and using that distribution to guide decisions, the method avoids relying on the mean return in every state and thereby ignoring changes of risk in the scene.
5-2, defining the action space as changes of acceleration and angular velocity in a discrete space, $a_t = \{\alpha, \omega\}$, where $\alpha$ is the acceleration and $\omega$ is the angular velocity.
5-3, determining the reward function $r_t$: defining a reward factor for reaching the end point, a penalty factor for colliding with an obstacle, a time penalty factor, and a reward function directed toward the end point at the current time.
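The point of step 5-1, that a return distribution can reveal risk the mean hides, can be shown with a toy example. IQN learns a quantile function of the return for each action; the sketch below replaces the trained network with a hand-written `toy_quantiles` function and averages quantiles only over the lowest fraction `eta` of the distribution, a CVaR-style rule. The patent names IQN but does not state which risk measure, if any, drives action selection, so the rule and all names here are assumptions. IQN normally samples the quantile levels at random; an even grid is used here so the result is reproducible.

```python
def risk_adjusted_value(quantile_fn, action, eta, n_samples=32):
    """Average return quantiles over levels in (0, eta); eta = 1 gives the mean."""
    taus = [eta * (i + 0.5) / n_samples for i in range(n_samples)]
    return sum(quantile_fn(action, tau) for tau in taus) / n_samples


def select_action(quantile_fn, n_actions, eta):
    """Pick the action with the best risk-adjusted value."""
    values = [risk_adjusted_value(quantile_fn, a, eta) for a in range(n_actions)]
    return max(range(n_actions), key=values.__getitem__)


def toy_quantiles(action, tau):
    """Stand-in for a trained quantile network (hypothetical values).

    Action 0 is safe: the return is always 10.
    Action 1 is risky: the worst 20% of outcomes are a collision (-100),
    the rest reach the goal quickly (+50).
    """
    if action == 0:
        return 10.0
    return -100.0 if tau < 0.2 else 50.0
```

With `eta = 1.0` the risky action wins because its mean return is higher; with `eta = 0.25` the collision tail dominates and the safe action is chosen, which is the behaviour a mean-only value function cannot express.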
Further, step 6 comprises the following steps:
6-1, initializing the deep reinforcement learning network and the input state $s_t$, selecting an action $a_t$ in the current input state, executing the action to obtain the reward $r_t$ at the current moment and the state $s_{t+1}$ at the next moment, and storing the experience tuple $(s_t, a_t, r_t, s_{t+1})$ in the experience replay pool.
6-2, determining the capacity of the experience replay pool. Once the accumulated experience tuples exceed this capacity, experience tuples are sampled in mini-batches, gradients are computed by gradient descent during backpropagation, and the network parameters are updated until the model converges.
Further, step 7 comprises the following step:
7-1, the properties of the vortices, static obstacles, and dynamic obstacles in the experimental scene are the same as in the training scene, but their positions and numbers are randomly initialized according to the experimental requirements.
Example 1
The invention relates to a local obstacle avoidance technique for a USV that guides the vessel to plan its navigation path autonomously in a water-surface environment that is obstacle-dense, highly dynamic, and high-risk. In this embodiment, a deep reinforcement learning network model is trained in a simulation environment, and verification of the algorithm in a single scene is taken as an example to describe the specific steps of local anti-pursuit collision avoidance and path planning. The parameter configuration and operating procedure are as follows:
Step 1, the local obstacle avoidance area is defined as 50 m × 50 m, the navigation start point of the USV is set to (5, 5), and the navigation end point is set to (45, 45). The USV is 1.255 m long and 0.29 m wide. The vortex positions are randomly initialized.
Step 2, static obstacles are randomly initialized in the blank area, with radii uniformly distributed over [1, 3] m. The start coordinates of the dynamic pursuer obstacle are randomly initialized; the pursuer is 1 m long and 0.5 m wide.
Step 3, the composite observation vector shown in FIG. 2 is constructed. It comprises the vessel position provided by the satellite and inertial device signals, the speed information acquired by the Doppler velocimeter, and the relative distance information acquired by the lidar. The stack length is set to 4, so four consecutive observation frames are stacked as the input to the deep reinforcement learning network.
Step 4, the composite observation signal is fed into the spatial and temporal feature extraction modules constructed as shown in fig. 2, which reduce the data dimension while aggregating the spatial and temporal features of the environment observation signal; the value-distributional reinforcement learning network then makes a decision and outputs the action signal.
Step 5, the action space comprises acceleration changes of [-0.2, 0, 0.2] m/s² and angular velocity changes of [-5, 0, 5]/s in a discrete space. The reward function uses a goal reward of 50, a collision penalty of -100, and a time penalty of -1, and takes the difference between the distance to the end point at the previous moment and the distance to the end point at the current moment as a sub-reward factor.
Step 6, the neural network is trained with an experience replay strategy until convergence.
Step 7, the network parameters trained to convergence in step 6 are loaded into the model, a local navigation environment is randomly generated, and the anti-pursuit collision-avoidance and navigation path planning capabilities of the vessel in the local environment are verified, giving the experimental result shown in FIG. 4, where the orange rectangle represents the dynamic pursuer, the green rectangle represents the USV, and the gray circles represent static obstacles.
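The reward terms listed in step 5 of this example can be combined as in the sketch below. The constants are the values given in the example; how the terms are combined (terminal rewards returned on their own, the progress term added to the time penalty with unit weight) is an assumption, because the reward formula itself is not reproduced in the text. The function name `step_reward` is hypothetical.

```python
import math

# Values taken from step 5 of Example 1.
GOAL_REWARD = 50.0
COLLISION_PENALTY = -100.0
TIME_PENALTY = -1.0


def step_reward(prev_pos, pos, goal, reached_goal, collided):
    """Reward r_t for one transition (the combination rule is an assumption)."""
    if collided:
        return COLLISION_PENALTY
    if reached_goal:
        return GOAL_REWARD
    # Progress term: previous distance to the end point minus current distance,
    # positive when the USV moved closer to the end point.
    progress = math.dist(prev_pos, goal) - math.dist(pos, goal)
    return TIME_PENALTY + progress
```

With these numbers a step that closes less than 1 m of distance still yields a negative reward, so the time penalty pushes the policy toward reaching the end point quickly rather than loitering at a safe distance from the pursuer.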
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The foregoing describes embodiments of the invention in further detail. It should be understood that this description is merely illustrative and does not limit the scope of the invention; any modification, equivalent substitution, or improvement made within the spirit and principles of the invention falls within its scope.

Claims (10)

1. A local anti-pursuit method for an unmanned surface vehicle (USV) that aggregates spatiotemporal features, characterized in that the method comprises:
Step 1: defining the range of a local navigation area, setting the start and end coordinates of the USV within the area, and randomly generating vortex signals within the area;
Step 2: constructing the navigation environment by randomly generating static obstacles of different sizes and dynamic obstacles at random positions in the blank area;
Step 3: generating simulated signals that provide speed, own-ship position, and relative obstacle distance information, based on the sensor types carried by a real vessel, and constructing observation signals from them;
Step 4: feeding the observation signals into a spatiotemporal feature aggregation module, which aggregates spatial features by stacking frames along the time dimension, and feeding the observation signals frame by frame into a continuous-time neural network to aggregate temporal features;
Step 5: selecting a distributional reinforcement learning algorithm as the backbone framework of the navigation planning network, and defining the action space and the reward function;
Step 6: collecting experience tuples during training in an experience pool, optimizing the network weight parameters on mini-batches of experience tuples, and training the deep reinforcement learning model with an experience replay strategy until convergence;
Step 7: constructing an experimental verification scene, loading the converged model parameters, and verifying the vessel's anti-pursuit collision-avoidance capability in a local environment and its ability to generalize to new scenes.

2. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that the local navigation area is a simulation environment built to verify the effectiveness of the algorithm.

3. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that the spatiotemporal feature aggregation module consists of an independent spatial feature aggregation module and a temporal feature extraction module connected in series.

4. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that step 1 specifically comprises the following steps:
1-1: defining the range of the local navigation area of the USV, with length H and width W;
1-2: initializing the start coordinates $P_{STA}(x, y)$ and end coordinates $P_{END}(x, y)$ of the USV; within the current local navigation area, the coordinates of the USV at each moment are $P(x, y)$.

5. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that step 2 specifically comprises the following steps:
2-1: selecting a random operator and, for each static obstacle, randomly initializing its position coordinates within the blank area and drawing its radius from a uniform distribution, where $s_i$ denotes the i-th static obstacle;
2-2: for each dynamic obstacle, randomly generating its initial position coordinates in the remaining blank area and setting its end point to the position $P(x, y)$ of the USV at the current moment, so that the end point changes continuously as the USV moves, where $d_j$ denotes the j-th dynamic obstacle.

6. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that step 3 comprises the following steps:
3-1: simulating the working principle and output characteristics of each sensor, based on the types of shipborne sensors, to generate the observation signal at the current moment;
3-2: generating satellite and inertial device signals that provide the vessel's own position $P(x, y)$;
3-3: simulating a Doppler velocimeter signal that provides the vessel's current speed over ground $V(v_x, v_y)$;
3-4: simulating a lidar signal that provides the relative distances $D(d_1, d_2, \dots, d_N)$ from the vessel to obstacles in the surrounding environment, where N is the number of laser beams generated when simulating the lidar signal;
3-5: concatenating the vessel's position, speed, and relative obstacle distance information into the observation signal for the current environment, $s_t = \{P(x, y), V(v_x, v_y), D(d_1, d_2, \dots, d_N)\}$.

7. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that step 4 comprises the following steps:
4-1: designing a stack structure for the observation signal $s_t$ concatenated in step 3 and stacking the signals along the time dimension to construct spatial feature information rich in changes of obstacle position, $S_t = \{s_{t-k}, \dots, s_{t-1}, s_t\}$, where $s_t = \{P(x, y), V(v_x, v_y), D(d_1, d_2, \dots, d_N)\}$; motion features are extracted from this spatial feature information with a rectangular convolutional neural network;
4-2: after a neural network has aggregated the spatial feature information, a continuous-time neural network further extracts the motion dependencies within it to obtain discriminative features of the environment observation signal in the time dimension; specifically, the continuous-time neural network is composed of the sub-networks {f, g, h}, and the three sub-networks share a common backbone network structure.

8. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that step 5 comprises the following steps:
5-1: selecting the value-distributional reinforcement learning algorithm IQN as the network framework; by modeling the distribution of the cumulative return and using that distribution to guide decisions, the method avoids relying on the mean return in every state and thereby ignoring changes of risk in the scene;
5-2: defining the action space as changes of acceleration and angular velocity in a discrete space, $a_t = \{\alpha, \omega\}$, where $\alpha$ is the acceleration and $\omega$ is the angular velocity;
5-3: determining the reward function $r_t$: defining a reward factor for reaching the end point, a penalty factor for colliding with an obstacle, a time penalty factor, and a reward function directed toward the end point g at the current time t.

9. The local anti-pursuit method for a USV that aggregates spatiotemporal features according to claim 1, characterized in that step 6 comprises the following steps:
6-1: initializing the deep reinforcement learning network and the input state $s_t$, selecting an action $a_t$ in the current input state, executing the action to obtain the reward $r_t$ at the current moment and the state $s_{t+1}$ at the next moment, and storing the experience tuple $(s_t, a_t, r_t, s_{t+1})$ in the experience replay pool;
6-2: determining the capacity of the experience replay pool. When the accumulated experience tuples exceed this capacity, experience tuples are sampled in mini-batches.
Use the gradient descent algorithm to calculate the gradient during the backpropagation process to complete the network parameter update until the model converges.10.根据权利要求1所述的一种聚合时空特征的水面无人艇局部抗追击方法,其特征在于,所述步骤7包括以下步骤:10. The method for local anti-pursuit of an unmanned surface vehicle by aggregating spatiotemporal features according to claim 1, wherein step 7 comprises the following steps:7-1:实验场景中涡流、静态障碍物、动态障碍物属性与训练场景中保持一致,但其位置和数目随实验要求进行随机初始化。7-1: The properties of vortices, static obstacles, and dynamic obstacles in the experimental scene remain consistent with those in the training scene, but their positions and numbers are randomly initialized according to the experimental requirements.
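Step 2-2 of claim 5 says only that each dynamic obstacle takes the vehicle's current position P(x, y) as its end point, so the target is re-read at every time step. The Python sketch below illustrates one possible reading of that rule. The function name `pursue_step`, the constant obstacle speed, the time step `dt`, and the straight-line heading are all assumptions made for illustration and are not specified in the claims.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def pursue_step(obstacle_xy: Point, usv_xy: Point, speed: float, dt: float) -> Point:
    """Move a dynamic obstacle one step toward the vehicle's current position.

    Assumption: the obstacle heads straight at the target with constant speed;
    the claims only state that the end point tracks P(x, y).
    """
    dx = usv_xy[0] - obstacle_xy[0]
    dy = usv_xy[1] - obstacle_xy[1]
    distance = math.hypot(dx, dy)
    step = speed * dt
    if distance <= step:
        # The obstacle reaches the vehicle within this step.
        return usv_xy
    scale = step / distance
    return (obstacle_xy[0] + dx * scale, obstacle_xy[1] + dy * scale)


# Hypothetical usage: the target is refreshed each step because the vehicle moves.
obstacle = (0.0, 0.0)
usv_track = [(3.0, 4.0), (3.5, 4.0), (4.0, 4.0)]
for usv in usv_track:
    obstacle = pursue_step(obstacle, usv, speed=1.0, dt=1.0)
```

Passing the vehicle position in at every call, instead of fixing a goal at initialization, is what distinguishes a pursuing obstacle from one that moves at constant velocity along a fixed line.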
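Step 3-5 of claim 6 and step 4-1 of claim 7 can be sketched together: the observation s_t is a flat concatenation of position, velocity, and lidar distances, and S_t keeps the k+1 most recent observations in time order. The Python below is a minimal sketch under stated assumptions; `build_observation`, `ObservationStack`, and the choice to fill the stack with the first frame at episode start are hypothetical and do not come from the patent.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple


def build_observation(
    position: Tuple[float, float],
    velocity: Tuple[float, float],
    lidar_ranges: Sequence[float],
) -> List[float]:
    """Concatenate P(x, y), V(v_x, v_y) and D(d_1, ..., d_N) into one flat s_t."""
    return [*position, *velocity, *lidar_ranges]


class ObservationStack:
    """Holds S_t = {s_{t-k}, ..., s_{t-1}, s_t}, oldest frame first."""

    def __init__(self, k: int) -> None:
        if k < 0:
            raise ValueError("k must be non-negative")
        self._size = k + 1
        self._frames: Deque[List[float]] = deque(maxlen=self._size)

    def reset(self, first_observation: Sequence[float]) -> None:
        # Assumption: at episode start the first frame is repeated to fill the stack.
        self._frames.clear()
        for _ in range(self._size):
            self._frames.append(list(first_observation))

    def push(self, observation: Sequence[float]) -> List[List[float]]:
        # A deque with maxlen drops the oldest frame automatically.
        self._frames.append(list(observation))
        return [list(frame) for frame in self._frames]


# Hypothetical usage with N = 3 lidar beams and k = 2.
stack = ObservationStack(k=2)
s0 = build_observation((0.0, 0.0), (1.0, 0.0), [5.0, 4.0, 3.0])
stack.reset(s0)
S_t = stack.push(build_observation((1.0, 0.0), (1.0, 0.0), [4.0, 3.0, 2.0]))
```

The result is a (k+1) by (4+N) array of rows, which is the kind of two-dimensional input a convolutional network with rectangular kernels could scan along both the time and feature axes.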
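The experience replay mechanism of claim 9 can be sketched as a bounded buffer of (s_t, a_t, r_t, s_{t+1}) tuples that only yields mini-batches once enough experience has accumulated. The Python below is an illustrative sketch, not the patented implementation: the class name `ReplayPool` is hypothetical, and separating a sampling `threshold` from the buffer `capacity` is an assumption, since a bounded buffer can never hold more tuples than its capacity.

```python
import random
from collections import deque
from typing import Any, Deque, List, Optional, Tuple

Transition = Tuple[Any, Any, float, Any]  # (s_t, a_t, r_t, s_{t+1})


class ReplayPool:
    """Bounded experience pool that allows sampling only after a warm-up."""

    def __init__(self, capacity: int, threshold: int, seed: Optional[int] = None) -> None:
        if not 0 <= threshold < capacity:
            raise ValueError("threshold must satisfy 0 <= threshold < capacity")
        self._buffer: Deque[Transition] = deque(maxlen=capacity)
        self._threshold = threshold
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._buffer)

    def store(self, state: Any, action: Any, reward: float, next_state: Any) -> None:
        # The oldest tuple is discarded automatically once capacity is reached.
        self._buffer.append((state, action, reward, next_state))

    def ready(self) -> bool:
        # Sampling starts once the stored count exceeds the threshold.
        return len(self._buffer) > self._threshold

    def sample(self, batch_size: int) -> List[Transition]:
        if not self.ready():
            raise RuntimeError("not enough stored experience to sample yet")
        population = list(self._buffer)
        return self._rng.sample(population, min(batch_size, len(population)))


# Hypothetical usage: store transitions, then draw a mini-batch when ready.
pool = ReplayPool(capacity=5, threshold=2, seed=0)
for t in range(7):
    pool.store(state=t, action=0, reward=-0.1, next_state=t + 1)
batch = pool.sample(batch_size=3) if pool.ready() else []
```

Uniform sampling from stored tuples breaks the temporal correlation between consecutive steps, which is the usual motivation for a replay pool; the gradient update itself is left to the chosen deep learning framework.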
CN202411799637.1A | 2024-12-09 | 2024-12-09 | A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features | Active | CN119756364B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202411799637.1A, CN119756364B (en) | 2024-12-09 | 2024-12-09 | A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202411799637.1A, CN119756364B (en) | 2024-12-09 | 2024-12-09 | A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features

Publications (2)

Publication Number | Publication Date
CN119756364A (en) | 2025-04-04
CN119756364B (en) | 2025-09-09

Family

ID=95174313

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202411799637.1A (Active, CN119756364B (en)) | A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features | 2024-12-09 | 2024-12-09

Country Status (1)

Country | Link
CN (1) | CN119756364B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112327850A (en) * | 2020-11-06 | 2021-02-05 | 大连海事大学 | Unmanned surface vehicle path planning method
CN118732705A (en) * | 2024-05-28 | 2024-10-01 | 浙江大年科技有限公司 | Autonomous obstacle-avoiding drone based on deep learning and reinforcement learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110703762B (en) * | 2019-11-04 | 2022-09-23 | 东南大学 | Hybrid path planning method for unmanned surface vehicle in complex environment
CN114942643B (en) * | 2022-06-17 | 2024-05-14 | 华中科技大学 | Construction method and application of USV unmanned ship path planning model
US20240394568A1 (en) * | 2023-05-24 | 2024-11-28 | Global Spatial Technology Solutions Inc. | System and method for vessel identification
CN117519197A (en) * | 2023-12-01 | 2024-02-06 | 中国船舶集团有限公司系统工程研究院 | A local path planning method and device for surface unmanned boats

Also Published As

Publication number | Publication date
CN119756364A (en) | 2025-04-04

Similar Documents

Publication | Title
Liu et al. | Self-adaptive dynamic obstacle avoidance and path planning for USV under complex maritime environment
CN110333739B (en) | AUV (Autonomous Underwater Vehicle) behavior planning and action control method based on reinforcement learning
Ouahouah et al. | Deep-reinforcement-learning-based collision avoidance in UAV environment
Guan et al. | Autonomous collision avoidance of unmanned surface vehicles based on improved A-star and dynamic window approach algorithms
Cao et al. | Target search control of AUV in underwater environment with deep reinforcement learning
CN101408772B (en) | AUV Intelligent Collision Avoidance Method
CN111829527A (en) | A path planning method for unmanned ships based on deep reinforcement learning and considering marine environment elements
CN109241552A (en) | A kind of underwater robot motion planning method based on multiple constraint target
Yan et al. | Reinforcement Learning-Based Autonomous Navigation and Obstacle Avoidance for USVs under Partially Observable Conditions
CN115167447B (en) | Intelligent obstacle avoidance method for unmanned boat based on end-to-end deep reinforcement learning of radar images
CN111273670A (en) | Unmanned ship collision avoidance method for fast moving barrier
CN118153431A (en) | Underwater multi-agent cooperative trapping method and device based on deep reinforcement learning
Qin et al. | An environment information-driven online Bi-level path planning algorithm for underwater search and rescue AUV
CN112800545B (en) | Unmanned ship self-adaptive path planning method, equipment and storage medium based on D3QN
CN117590867A (en) | Underwater autonomous vehicle connection control method and system based on deep reinforcement learning
CN117519197A (en) | A local path planning method and device for surface unmanned boats
Song et al. | Surface path tracking method of autonomous surface underwater vehicle based on deep reinforcement learning
Zhang et al. | Path planning of USV in confined waters based on improved A* and DWA fusion algorithm
Qu et al. | USV path planning under marine environment simulation using DWA and safe reinforcement learning
CN119396156B (en) | A scalable collaborative obstacle avoidance decision-making method for unmanned watercraft swarms
Zhang et al. | Ship collision avoidance using constrained deep reinforcement learning
CN119756364B (en) | A local anti-pursuit method for surface unmanned vehicles based on aggregation of spatiotemporal features
CN118466483A (en) | A local navigation planning method for unmanned boat based on beam diagram state input
CN119045484A (en) | Distributed unmanned ship formation control method based on attention mechanism SAC, electronic equipment and storage medium
Jose et al. | Navigating the ocean with drl: Path following for marine vessels

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
