CN120141447A - Reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting - Google Patents

Reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting

Info

Publication number
CN120141447A
CN120141447A
Authority
CN
China
Prior art keywords
sensor
data
imu
gaussian
reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202510607098.5A
Other languages
Chinese (zh)
Other versions
CN120141447B (en)
Inventor
刘丹
张宇
张盈盈
丁禹翔
刘云鹤
韩志凤
张健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Science and Technology
Priority to CN202510607098.5A
Publication of CN120141447A
Application granted
Publication of CN120141447B
Status: Active
Anticipated expiration

Abstract


The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting described in this application belongs to the field of simultaneous localization and mapping. It proposes an overall solution that further combines 4D GS and reinforcement learning on top of an existing multi-sensor fusion SLAM system: reinforcement learning autonomously selects the sensor mode according to real-time environmental information, while 4D GS is used to update and optimize the map and is integrated into the SLAM system, so as to improve the working efficiency, real-time performance and resource economy of the entire real-time multimodal SLAM system. The method includes the following steps: step (1), multi-source sensor input and data preprocessing; step (2), selection of the sensor combination type; step (3), multi-source sensor data fusion; step (4), map update and optimization.

Description

Reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting
Technical Field
The application relates to the field of simultaneous localization and mapping (SLAM), and in particular to an adaptive multimodal online SLAM method based on reinforcement learning and 4D Gaussian splatting (4D GS) that is suited to dynamic environments.
Background
Simultaneous localization and mapping (SLAM) is a technique that uses data from multiple sensors (e.g., camera, lidar, IMU) to estimate the sensor pose while reconstructing a 3D map of the surrounding environment. SLAM addresses the autonomous positioning and navigation of a robot or device in an unknown environment and is one of the key technologies for fully autonomous mobile robots.
Existing 3D Gaussian splatting (3D GS) is a technique for 3D scene reconstruction and rendering that represents points or surfaces in the scene with Gaussian functions and processes them in 3D space, but 3D GS focuses on static scenes. It can be extended to dynamic scenes as a 4D representation; a key technical issue is how to model complex point motion from sparse inputs. 4D Gaussian splatting can be seen as an extension of 3D Gaussian splatting that adds a fourth dimension to the 3D space, typically used to process dynamic scenes or time-varying data. 4D Gaussian splatting has advantages in processing dynamic content, such as modeling and rendering dynamic objects and reconstructing scenes under time-varying lighting conditions.
SLAM systems fusing laser, vision and IMU sensors have been developed. For example, the previously published Chinese patent application CN202410154510.8, entitled "dense visual SLAM method and system using three-dimensional Gaussian back-end characterization", proposes a dense visual SLAM method that constructs a three-dimensional Gaussian scene representation, performs adaptive three-dimensional Gaussian expansion, reconstructs the scene geometry, and estimates the camera pose coarse-to-fine based on the reconstructed geometry. It faces challenges in handling dynamic objects or fast-changing scenes, and its relatively high computational cost limits its application in resource-constrained environments.
In another example, application CN202411103239.1, entitled "Gaussian splatting-based laser-enhanced visual three-dimensional reconstruction method and system", divides the method into pose calculation and scene reconstruction. During pose calculation, the relative rotation is solved by image feature matching, the point cloud ICP pose is initialized, and the relative pose is optimized; during scene reconstruction, the scene is represented by three-dimensional Gaussian spheres, the differentiably rendered synthetic image is supervised by images with high-precision poses and by depth maps generated from point cloud projection, and the parameters of the three-dimensional Gaussian spheres are optimized to reconstruct the scene. Its dependence on lidar data means that reconstruction quality degrades when the lidar data are inaccurate or occluded, data inconsistency and calibration errors arise during fusion, the overall algorithm complexity is high, and the demand on computing resources is large.
In another example, application CN202410277176.5, entitled "a dense Gaussian map reconstruction method and system under dynamic environment", uses instance segmentation to separate potentially moving objects and obtain dynamic prior regions, detects and rejects dynamic feature points within the segmented regions with a dynamic point detection algorithm, obtains camera pose estimates from static feature points, and performs dense map reconstruction of static regions by combining a visual SLAM system with a 3D Gaussian Splatting framework. The definition and detection of dynamic objects depend on instance segmentation and a preset dynamic prior region, so the method cannot adapt to all types of dynamic scenes, in particular to rapid or complex dynamic changes, and its reliance on optical flow and epipolar geometry leads to inaccurate dynamic point detection in certain situations.
Multimodal SLAM technology in the prior art mainly has the following deficiencies. First, dynamic objects in the scene cause tracking loss and relocalization failure, that is, SLAM faces severe degradation when operating in dynamic, complex and changeable environments. In particular, pose estimation and map construction perform poorly in low-texture areas, under limited viewing-angle constraints and in dynamic environments, where dynamic objects cause large trajectory drift, loop-closure detection failure and ghosting in the reconstructed map. Second, because fixed rules such as weighted fusion and static priority are commonly adopted, sudden environmental changes caused by sensor occlusion or abrupt illumination changes are difficult to handle, and running multiple sensors simultaneously produces redundant computation that increases the power consumption and latency of mobile devices. Third, existing NeRF methods carry a heavy training and rendering cost, and the rendering process has non-negligible latency. Fourth, representing the scene as 3D Gaussians cannot meet the real-time rendering speed required by dynamic scenes.
In view of this, the present application has been made.
Disclosure of Invention
The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting provided by the application aims to solve the problems of the prior art. It provides an overall solution that further combines 4D GS and reinforcement learning on top of an existing multi-sensor fusion SLAM system: the sensor mode is autonomously selected according to real-time environment information through reinforcement learning, while the map is updated and optimized with 4D GS and integrated into the SLAM system, so as to improve the working efficiency and real-time performance of the entire real-time multimodal SLAM system while saving resources.
In order to achieve the above design objective, the reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting includes the following steps:
step 1, multi-source sensor input and data preprocessing;
Receiving point cloud data from a laser radar, RGB image data from a depth camera and IMU data, and performing data preprocessing;
Step 2, selecting a sensor combination type;
Based on the reinforcement learning sensor selection module, the environment state is learned in real time through a reinforcement learning strategy network, the most reliable sensor combination type is dynamically selected, and the sensor is activated as required;
step 3, multi-source sensor data fusion processing;
based on the data fusion processing module, the sensor combination is used for carrying out multi-source sensor data fusion processing;
step 4, updating and optimizing the map;
Based on the map updating and rear-end optimizing module, the map data provided by the front-end data fusion processing module is utilized to update and optimize the Gaussian map of the environment map by combining 4D GS, and error correction and map optimization are carried out through loop detection.
Further, the step 1 comprises the following steps,
Step 1.1, data input;
the SLAM system synchronizes the point cloud data received from the laser radar, the RGB image data from the depth camera and the IMU data in time;
step 1.2, time information is fused in;
a timestamp is introduced in the data preprocessing stage, and each lidar point cloud and image frame is provided with a timestamp;
step 1.3, calibrating multi-sensor external parameters;
calibrating external parameters between the sensors by using Kalibr frames;
Step 1.4, initializing a reinforcement learning module;
the method comprises the steps of initializing a reinforcement learning sensor selection module in a data preprocessing stage, learning a strategy for selecting an optimal sensor under different scene conditions by analyzing historical data and environmental characteristics, and adopting a random exploration strategy in an initial state.
Further, the step 2 comprises the following steps,
Step 2.1, designing a state space;
defining environmental information which can be observed by the reinforcement learning system at each decision moment;
Step 2.2, defining an action space formula;
including the acts of defining a system switching sensor mode to optimize data acquisition and system performance;
The SLAM system adopts a mixed discrete-continuous space to realize fine control, and the mode selection part is discrete and comprises three sensor dominant modes, namely a depth camera-IMU mode, a laser radar-depth camera mode and a laser radar-IMU-depth camera mode;
The weight distribution part is continuous and comprises a sensor gain coefficient and a resource limiting parameter, wherein the sensor gain coefficient is the weight of the laser radar, the depth camera and the IMU, and the weight range is between 0 and 1, and the expression is as follows:
(9)
The resource limiting parameter is the laser radar sampling rate, the range is between 5Hz and 40Hz, and the expression is as follows:
(10)
step 2.3, rewarding a function module;
Including primary rewards, auxiliary rewards, and penalty items;
step 2.4, designing a network architecture;
The system comprises a state encoder which is used as the basis of the whole network and is responsible for converting the multi-mode sensor data into a unified characteristic representation so that the subsequent strategy network and the value network can make decisions and evaluations based on the characteristics;
the strategy network is responsible for outputting specific action selection according to the coded state characteristics, wherein the specific action selection comprises discrete action modes and continuous parameter adjustment;
the value network is used for evaluating the advantages and disadvantages of the current state and action combination, providing learning signals for the strategy network and helping the strategy network to optimize the decision process;
Step 2.5, executing a training strategy;
through an efficient data collection mechanism, an intelligent exploration strategy and strict safety learning constraint, a comprehensive training framework is provided for the self-adaptive selection module of the reinforcement learning sensor, which comprises,
The data collection mechanism provides experience data interacted with the environment for the system and is the basis of learning and optimizing strategies;
exploration strategies, deciding how the system effectively explores in an unknown or partially known environment to obtain more information and experience;
Safety learning constraints ensure that the system not only pursues high performance in the learning and decision process, but also meets a series of safety and practicality requirements.
Further, the step 2.1 comprises the steps of,
Geometric dynamic characteristic design comprising laser radar point cloud density gradient and dynamic object coverage rate;
the laser radar point cloud density gradient reflects the density change of the point cloud in space, helps the system to perceive the geometric structure and potential obstacle of the environment, quantifies the observation confidence of the laser radar in the local area, has the expression as follows,
(1)
wherein the terms in formula (1) denote, respectively, the number of lidar hits within a unit voxel, the volume of the voxel, and the normalized range of reflected-intensity values;
calculating the coverage rate of dynamic objects to represent the proportion and distribution of the dynamic objects in the scene, so that the system can know the dynamic characteristics of the environment in time, the expression is as follows,
(2)
wherein the terms in formula (2) denote, respectively, the estimated velocity vector of the dynamic object, the projected area of the dynamic object detection frame, and the field-of-view area of the sensor;
Sensory reliability, including IMU confidence and radar-visual depth consistency, the IMU confidence reflects the confidence of the inertial measurement unit data, calculated according to its noise level and bias stability, the higher the result value, the more reliable the IMU data is indicated, the expression is as follows:
(3)
wherein the terms in formula (3) denote, respectively, the zero-offset (bias) estimation error of the gyroscope and the accelerometer noise standard deviation (computed over a sliding window);
The radar-vision depth consistency measures the consistency of the vision sensor and the radar depth data, the higher the consistency is calculated by comparing the depth measured values of the vision sensor and the radar depth data, the more coordinated the depth data of the two sensors is shown, and the expression is as follows:
(4)
wherein the terms in formula (4) denote, respectively, the observed depth-distribution histogram, the depth-distribution histogram of the depth camera, and the intersection ratio of their effective areas;
The system resource state monitoring comprises real-time computing power load and energy residual budget, wherein the real-time computing power load reflects the current computing pressure of the system and helps a module to determine whether the sensor data processing amount needs to be reduced or the computing task needs to be simplified, and the expression is as follows:
(5)
wherein the terms in formula (5) denote, respectively, the GPU memory currently in use, the total GPU memory capacity, the single-frame rendering time, and the desired frame period;
the energy residual budget indicates the energy reserve condition of the system, so that the module selects a sensor or adjusts the working mode based on the current energy residual budget, and the expression is as follows:
(6)
wherein the terms in formula (6) denote, respectively, the current remaining battery charge, the initial total charge, the elapsed operating time of the system, and a charge attenuation factor;
The space-time context comprises a historical decision memory and a scene category probability, wherein the historical decision memory records the success rate and effect of past sensor selection decisions, helps a module to learn from experience, avoids repeated errors and improves the decision efficiency, and the expression is as follows:
(7)
wherein the terms in formula (7) denote, respectively, the action taken at a past step and a long short-term memory (LSTM) network;
The scene category probability calculates the probability of belonging to a low-texture scene, a limited view constraint scene and a dynamic scene according to the current environment characteristics, so that a module can adjust a sensor selection strategy according to scene characteristics, and the expression is as follows:
(8)
wherein the terms in formula (8) denote, respectively, the image-text joint embedding vector extracted by the CLIP model and a learnable weight matrix; the distribution is normalized so that it sums to 1.
Further, the step 2.3 comprises,
The main rewards comprise positioning accuracy rewards and energy efficiency economic rewards which are respectively used for improving the positioning accuracy of the system and the energy utilization efficiency, and the expression is as follows:
(11)
(12)
where ATE denotes the absolute trajectory error, and the remaining terms denote the total instantaneous power consumption and a bonus compensation term applied when the remaining charge exceeds a threshold;
Auxiliary rewards, including rendering quality rewards and strategy stability rewards, are aimed at improving visual effect of scene reconstruction and ensuring continuity and reliability of strategy, and are expressed as follows:
(13)
(14)
wherein SSIM denotes the structural similarity index, LPIPS denotes learned perceptual image patch similarity, and the remaining terms denote an action-mutation penalty and a policy-network parameter smoothness constraint;
the punishment items comprise fatal punishment errors and resource overload punishment and are used for avoiding unreasonable sensor selection and excessive consumption of resources, so that a system is guided to learn an optimal sensor selection strategy to realize comprehensive optimization of system performance, and the expression is as follows:
(15)
(16)
The comprehensive rewards calculation formula is as follows:
(17)
Further, the step3 comprises the following steps,
The reinforcement learning sensor selection module adaptively selects the most suitable sensor mode according to the influence of each parameter on the sensor mode selection according to the current environmental condition and the system state, wherein the weight function expression is as follows:
S = w1·S_lidar + w2·S_dyn + w3·S_imu + w4·S_cons + w5·S_load + w6·S_energy (21)
wherein w1, w2, w3, w4, w5 and w6 respectively denote the weights of the lidar confidence S_lidar, the dynamic object coverage S_dyn, the IMU confidence S_imu, the radar-vision consistency S_cons, the computing load S_load and the energy surplus budget S_energy;
step 3.1, a depth camera-IMU mode;
when, according to the current environmental characteristics, the reinforcement learning sensor selection module finds that the weight function S is greater than or equal to a first judgment threshold, the SLAM system selects the depth camera and IMU sensors, with the depth camera dominant and the IMU auxiliary;
Step 3.2 lidar-depth camera mode;
when, according to the current environmental characteristics, the reinforcement learning sensor selection module finds that the weight function S satisfies a second judgment threshold, the SLAM system selects the lidar and depth camera sensors, with the depth camera dominant and the lidar auxiliary;
Step 3.3, a multi-sensor equalization mode of the laser radar-IMU-depth camera;
when, according to the current environmental characteristics, the reinforcement learning sensor selection module judges through the weight function S that a combined mode is required, the SLAM system selects the sensor mode combining the lidar, the depth camera and the IMU;
And 3.4, after the reinforcement learning sensor selection module executes actions and interacts with the environment, recording historical decision memory through the rewarding function module, evaluating through the value network, and optimizing updating of the whole strategy network.
Further, the step 3.1 comprises,
Step 3.1.1, data input and key frame selection;
The frames containing new scene structures or salient feature increases are preferentially selected as key frames, so that the integrity and efficiency of the map are improved;
step 3.1.2, performing point cloud registration by adopting generalized ICP tracking;
The RGB image and depth image of the current frame are selected to generate a point cloud, and the covariance matrix of each point is calculated. The GICP algorithm estimates the relative pose transformation between the current-frame source point cloud and the target map point cloud;
Step 3.1.3. IMU pre-integration;
The IMU high-frequency measurement value is fused, the inter-frame motion pre-integral quantity is generated and used as an initial guess of GICP, the tracking efficiency is improved, and the robustness and the accuracy of the system are enhanced by combining with a GICP algorithm;
step 3.1.4, updating the pre-integration;
and adopting GICP optimized results to correct drift of IMU integration caused by noise and deviation along with time.
Further, the step 3.2 comprises the following steps,
Step 3.2.1 data input an image from a depth camera and a point cloud from a lidar;
Integration is performed by using calibrated external signals, and time aligned LiDAR point clouds are converted into depth images, wherein the expression is as follows:
(38)
Wherein,In the form of a point cloud of a lidar,AndThe rotation matrix and translation vector of the lidar to the camera coordinate system,Is an intrinsic matrix of the camera;
Step 3.2.2 using an incremental error minimization function;
ensuring accurate correspondence between planes and points, the formula is as follows:
(39)
Wherein,Representing a lidar point cloudIs provided with a plurality of points in the middle,Based on the current attitude estimation passing from the last moment to the world coordinate systemAs a result of the number of iterations,Is closest toIs arranged in the center of the gaussian,Is thatIs defined in the specification.Is a dotIs used for the weight of the (c),Is a regularization term used for enhancing the stability and precision of the error function, and considers the direction error between normal vectors;
Introducing regularization termTo enhance the stability and accuracy of the error function and to take into account the error in the normal direction, the expression is as follows:
(40)
Wherein,Is the normal of the current Gaussian distribution;
step 3.2.3 weight function calculation;
The weight function is calculated as follows:
a. determining the Gaussian centers within a local sphere: find all nearest Gaussian distribution centers inside a sphere defined by a center point and a radius;
b. calculating a density function of Gaussian points, the density functionCalculated by the following formula:
(41)
Wherein,Is a reconstructed covariance matrix by selecting the smallest variance along the normal directionAnd greater variance in the vertical directionTo construct;
c. simplifying the calculation of the density function, and simplifying the calculation of the density function in the tracking process in order to accelerate the calculation speed:
(42)
d. consistency calculations, for each pointCalculating the normal of the current Gaussian distributionFrom local average normalConsistency of (2);
e. texture complexity calculation: for the image region corresponding to each radar point, compute the local texture complexity. The radar point is projected into the camera pixel coordinate system to obtain its image position, a 16x16 image block centered at that position is extracted, the block is converted to grayscale, and the variance of the pixel intensities is calculated:
(43)
Wherein,For the intensity of the pixel(s),An image block average value;
converting the variance to a texture weight of 0-1 by a sigmoid function:
(44)
Wherein,For a scaling factor, for adjusting variance sensitivity;
f. final weight function: the final weight function is defined as the product of the normal consistency, the density function and the texture complexity.
Further, the step 3.3 comprises the following steps,
Step 3.3.1, synchronizing the data input with the hardware;
the laser radar provides sparse but high-precision 3D point cloud, the depth camera captures RGB textures of a scene, and the IMU outputs angular speed and acceleration at high frequency for motion prediction, so that strict alignment of data time stamps of the three are ensured, and time sequence drift is avoided;
Step 3.3.2, key frame selection is carried out through depth camera input;
selecting representative frames from the continuous data stream as key frames, reducing redundancy calculation;
step 3.3.3. IMU data is used for state propagation;
predicting the current state through IMU pre-integration according to the input IMU data and the state of the last key frame, and predicting the current state through IMU pre-integration for state estimation and performing forward prediction on motion de-distortion;
step 3.3.4, de-distorting the laser radar input;
Using IMU predicted continuous pose to make each laser radar pointThe method comprises the following steps of converting a local coordinate system to a global coordinate system at the scanning moment, wherein the expression is as follows:
(45)
wherein the terms in formula (45) denote, respectively, the acquisition timestamp of the point and the pose at that moment obtained by IMU interpolation.
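As an illustration of this de-distortion step, the following Python sketch interpolates the IMU-predicted poses at each point's timestamp and transforms the points into the global frame, in the spirit of formula (45). The function and parameter names, the use of scipy's Slerp for rotation interpolation, and the linear interpolation of translation are assumptions for illustration only, not details taken from the patent.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew_points(points, point_times, imu_times, imu_rots, imu_trans):
    """Transform each lidar point into the global frame using the IMU pose
    interpolated at that point's acquisition timestamp (cf. formula (45)).

    points      : (N, 3) lidar points in the sensor frame
    point_times : (N,)   per-point timestamps, assumed to lie within imu_times
    imu_times   : (M,)   timestamps of the IMU-predicted poses
    imu_rots    : scipy Rotation object holding M rotations (sensor -> world)
    imu_trans   : (M, 3) translations (sensor -> world)
    """
    slerp = Slerp(imu_times, imu_rots)                  # rotation interpolation
    R_t = slerp(point_times).as_matrix()                # (N, 3, 3)
    t_t = np.stack([np.interp(point_times, imu_times, imu_trans[:, k])
                    for k in range(3)], axis=1)         # (N, 3) linear interpolation
    # p_world = R(t_i) @ p_i + t(t_i) for every point i
    return np.einsum('nij,nj->ni', R_t, points) + t_t
```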
Further, the step (4) comprises the following steps,
Step 4.1, maintaining a sliding window;
Step 4.2, 4D Gaussian distribution;
step 4.3, introducing an optical flow to solve the 4D GS overfitting problem;
step 4.4, updating and optimizing the map;
Step 4.5 loop detection;
The module can detect potential loop candidates by extracting laser radar and visual features and generating feature descriptors;
confirming a loop hypothesis by using geometric verification and consistency check;
and feeding the verified loop constraint back to the map optimization process to globally optimize the map structure.
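A schematic sketch of this loop-detection flow is given below. It assumes ORB descriptors for the visual features, brute-force matching to propose loop candidates, and point-to-point ICP as the geometric verification; these concrete choices, the threshold values and the keyframe-database layout are illustrative assumptions, not the specific descriptors claimed in the patent.

```python
import cv2
import numpy as np
import open3d as o3d

def detect_loop(kf_image, kf_cloud, keyframe_db, match_thresh=120, fitness_thresh=0.6):
    """Return (loop_id, relative_transform) for a verified loop closure, else None.
    kf_cloud and the stored clouds are o3d.geometry.PointCloud objects;
    keyframe_db is a list of dicts {"id", "desc", "cloud"} for past keyframes."""
    orb = cv2.ORB_create(1000)
    _, desc = orb.detectAndCompute(kf_image, None)
    if desc is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    for kf in keyframe_db:
        matches = matcher.match(desc, kf["desc"])
        if len(matches) < match_thresh:              # appearance-based candidate test
            continue
        # geometric verification: align the two lidar clouds with ICP
        reg = o3d.pipelines.registration.registration_icp(
            kf_cloud, kf["cloud"], max_correspondence_distance=0.5, init=np.eye(4),
            estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
        if reg.fitness > fitness_thresh:             # consistency check passed
            return kf["id"], reg.transformation      # loop constraint for global map optimization
    return None
```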
In summary, the reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting has the following advantages and beneficial effects:
1. On the basis of the existing multi-sensor fusion SLAM system, the method and the system select proper sensors in a self-adaptive mode according to specific environmental characteristics through reinforcement learning, have remarkable environmental adaptability and resource optimization capacity, improve positioning accuracy and map construction quality, reduce power consumption and calculation complexity, and improve robustness, efficiency and resource utilization rate of the system.
2. The application integrates the 4D Gaussian splatting technique into the multimodal SLAM system, expanding the scene adaptability of the SLAM system; compared with traditional SLAM techniques it has obvious advantages in rendering speed, memory efficiency, dynamic adaptability, robustness, resource efficiency, global consistency and other aspects.
3. The 4D GS adopted by the application can efficiently represent and process complex changes in dynamic scenes by adding the time dimension on the basis of the 3D GS. Meanwhile, an optical flow is introduced into the 4D GS to solve the problem of over-fitting, so that the method can capture the motion and deformation of a dynamic object, can remarkably reduce the memory occupation and the calculation complexity through sparse Gaussian distribution representation, and improves the stability and generalization capability of a model.
4. According to the application, more priori information is provided through the optical flow, and the excessive fitting of the deformation field network of the 4D GS to noise in training data is limited, so that the deformation field network learns a Gaussian point deformation mode which is more reasonable and more in line with a physical rule, and the stability and generalization capability of the model are improved.
5. The application is particularly suitable for mobile terminal equipment with limited resources and complex and changeable dynamic scenes, and provides a more efficient and more accurate solution for the application of the SLAM system in the dynamic environment.
Drawings
The application will now be further described with reference to the following drawings;
FIG. 1 is a system framework diagram of an adaptive multi-modal SLAM method according to the present application;
FIG. 2 is a flow chart of the reinforcement learning sensor adaptive selection module;
FIG. 3 is a depth camera-IMU mode flow diagram;
FIG. 4 is a lidar-depth camera mode flow diagram;
FIG. 5 is a lidar-depth camera-IMU mode flowchart;
FIG. 6 is a 4D Gaussian splatting flow chart;
FIG. 7 is a Gaussian splatting diagram;
Detailed Description
In order to further illustrate the technical means adopted by the present application for achieving the preset design purpose, the following preferred embodiments are presented in conjunction with the accompanying drawings.
In the following description, specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be embodied in many other forms than described herein and similarly generalized to the embodiments described herein may be made by those skilled in the art without departing from the spirit of the invention and the invention is therefore not limited to the specific embodiments disclosed below.
As shown in fig. 1, a system applying the adaptive multi-mode SLAM method of the present application includes a reinforcement learning sensor selection module, a data fusion processing module, a map updating and back-end optimizing module, a loop detection and global map optimizing module. The SLAM system takes three sensor types, namely depth camera-IMU, lidar-depth camera, and lidar-IMU-depth camera.
As shown in fig. 2 to 7, the reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting includes the steps of:
step 1, multi-source sensor input and data preprocessing;
Receiving point cloud data from a laser radar, RGB image data from a depth camera and IMU data, and performing data preprocessing;
Step 1.1, data input;
The SLAM system synchronizes the time of receiving the point cloud data from the laser radar, the RGB image data from the depth camera and the IMU data so as to ensure that all data under the same time stamp can be accurately corresponding;
step 1.2, time information is fused in;
to handle dynamic scenarios, time stamps are introduced during the data preprocessing phaseEach laser radar point cloud and each image frame are provided with a time stamp so that dynamic changes of a scene can be accurately tracked and compensated in subsequent processing, and therefore a necessary time background is provided for a deformation field network of the 4D GS;
step 1.3, calibrating multi-sensor external parameters;
calibrating external parameters between the sensors by using Kalibr frames;
Step 1.4, initializing a reinforcement learning module;
The method comprises the steps of initializing a reinforcement learning sensor selection module in a data preprocessing stage, and learning a strategy for selecting an optimal sensor under different scene conditions by analyzing historical data and environmental characteristics;
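To illustrate the time synchronization described in steps 1.1 and 1.2, the following Python sketch pairs each lidar scan with the closest RGB frame and with the IMU samples acquired since the previous scan. The buffer layout, tolerance value and function names are illustrative assumptions rather than the patent's concrete implementation.

```python
from bisect import bisect_left

def nearest(stamps, t):
    """Index of the timestamp in the sorted list `stamps` closest to t."""
    i = bisect_left(stamps, t)
    if i == 0:
        return 0
    if i == len(stamps):
        return len(stamps) - 1
    return i if stamps[i] - t < t - stamps[i - 1] else i - 1

def synchronize(lidar_msgs, rgb_msgs, imu_msgs, tol=0.02):
    """Pair each lidar scan with the closest RGB frame and with the IMU samples
    falling between consecutive scans; msgs are (timestamp, data) tuples."""
    rgb_t = [t for t, _ in rgb_msgs]
    frames = []
    for k, (t_l, cloud) in enumerate(lidar_msgs):
        j = nearest(rgb_t, t_l)
        if abs(rgb_t[j] - t_l) > tol:            # drop scans without a close image
            continue
        t_prev = lidar_msgs[k - 1][0] if k > 0 else t_l
        imu_seg = [m for m in imu_msgs if t_prev < m[0] <= t_l]
        frames.append({"t": t_l, "cloud": cloud, "rgb": rgb_msgs[j][1], "imu": imu_seg})
    return frames
```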
Step 2, selecting a sensor combination type;
Based on the reinforcement learning sensor selection module, the environment state is learned in real time through a reinforcement learning strategy network, including but not limited to data such as illumination, dynamic object density, sensor noise and the like, the most reliable sensor type is dynamically selected and the sensor is activated as required so as to adapt to a high-dynamic complex and changeable environment;
Step 2.1, designing a state space;
Defining environmental information that the reinforcement learning system can observe at each decision moment, for guiding the sensor selection strategy, including aspects,
Geometric dynamic characteristic design comprising laser radar point cloud density gradient and dynamic object coverage rate;
the laser radar point cloud density gradient reflects the density change of the point cloud in space, helps the system to perceive the geometric structure and potential obstacle of the environment, quantifies the observation confidence of the laser radar in the local area, has the expression as follows,
(1)
Wherein,Represents the number of lidar hits within a unit voxel,The volume of the voxel is represented and,Representing a range of normalized values of the reflected intensity;
calculating the coverage rate of dynamic objects to represent the proportion and distribution of the dynamic objects in the scene, so that the system can know the dynamic characteristics of the environment in time, the expression is as follows,
(2)
Wherein,Representing an estimated velocity vector of the dynamic object,Representing the projected area of the dynamic object detection frame,Representing a field of view area of the sensor;
the two characteristics provide key information for the reinforcement learning module together, so that the reinforcement learning module can optimize a sensor selection strategy according to the static and dynamic characteristics of the environment, and the adaptability and the accuracy of the system in a complex environment are improved;
Sensory reliability, including IMU confidence and radar-visual depth consistency, the IMU confidence reflects the confidence of the inertial measurement unit data, calculated according to its noise level and bias stability, the higher the result value, the more reliable the IMU data is indicated, the expression is as follows:
(3)
Wherein,Representing the zero offset estimation error of the gyroscope,Represents accelerometer noise standard deviation (sliding window calculation);
The radar-vision depth consistency measures the consistency of the vision sensor and the radar depth data, the higher the consistency is calculated by comparing the depth measured values of the vision sensor and the radar depth data, the more coordinated the depth data of the two sensors is shown, and the expression is as follows:
(4)
Wherein,Representing a histogram of the observed depth distribution,Representing a depth distribution histogram of the depth camera,Representing the cross ratio of the effective areas of the two;
The sensor data reliability enhancement method and the sensor data reliability enhancement system provide key information for the reinforcement learning sensor module together, assist the reinforcement learning sensor module to optimize sensor selection and weight distribution, and improve the perceptibility and decision accuracy of the system in a complex environment;
The system resource state monitoring comprises real-time computing power load and energy residual budget, wherein the real-time computing power load reflects the current computing pressure of the system and helps a module to determine whether the sensor data processing amount needs to be reduced or the computing task needs to be simplified, and the expression is as follows:
(5)
Wherein,Indicating the current GPU's used memory capacity,The total memory capacity of the GPU is represented,Representing that single frame rendering is time consuming,Representing a desired frame period;
The energy residual budget indicates the energy storage condition of the system, so that the module can preferentially select a sensor with lower energy consumption or adjust a working mode when the energy is limited, and the expression is as follows:
(6)
Wherein,Indicating the current remaining amount of battery,Indicating the initial total amount of power,Indicating that the system has been in operation for a time,Representing an electrical quantity attenuation factor;
The combined action of the two can ensure that the system meets the performance requirement and simultaneously realizes the efficient utilization of resources and the reasonable management of energy consumption;
The space-time context comprises a historical decision memory and a scene category probability, wherein the historical decision memory records the success rate and effect of past sensor selection decisions, helps a module to learn from experience, avoids repeated errors and improves the decision efficiency, and the expression is as follows:
(7)
Wherein,Indicating the past thThe step-by-step action is performed,Representing a long-short term memory network;
The scene category probability calculates the possibility that the scene belongs to different categories, such as long corridor, weak texture area and the like, according to the current environmental characteristics, so that the module can adjust the sensor selection strategy according to the scene characteristics, and the expression is as follows:
(8)
Wherein,Representing the image-text joint embedding vector extracted by the CLIP model,Representing a matrix of the weight that can be learned,To normalize the distribution, the sum is 1;
The combination of the sensor and the sensor can make the sensor selection more intelligent and has strong adaptability;
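A compact sketch of how the step 2.1 observation vector might be assembled is shown below, together with a histogram-intersection style implementation of the radar-vision depth consistency. The feature names, the concatenation order and the histogram parameters are assumptions made for illustration; the exact forms of formulas (1)-(8) are defined in the patent figures.

```python
import numpy as np

def depth_consistency(depth_lidar, depth_cam, bins=64, max_depth=50.0):
    """Histogram-intersection style consistency between lidar and camera depth
    (cf. formula (4)); valid pixels are those observed by both sensors."""
    valid = (depth_lidar > 0) & (depth_cam > 0)
    h1, _ = np.histogram(depth_lidar[valid], bins=bins, range=(0, max_depth), density=True)
    h2, _ = np.histogram(depth_cam[valid], bins=bins, range=(0, max_depth), density=True)
    return np.minimum(h1, h2).sum() / max(np.maximum(h1, h2).sum(), 1e-6)

def build_state(feat):
    """Concatenate the step 2.1 features into one observation vector.
    `feat` is a dict produced by upstream perception code, for example:
      rho_grad   - lidar point-cloud density gradient, formula (1)
      dyn_cov    - dynamic object coverage, formula (2)
      imu_conf   - IMU confidence, formula (3)
      depth_iou  - radar-vision depth consistency, formula (4)
      gpu_load   - real-time computing load, formula (5)
      energy     - remaining energy budget, formula (6)
      hist_mem   - LSTM embedding of past decisions, formula (7)
      scene_prob - scene-category probabilities (low-texture / limited-view / dynamic), formula (8)
    """
    scalars = np.array([feat["rho_grad"], feat["dyn_cov"], feat["imu_conf"],
                        feat["depth_iou"], feat["gpu_load"], feat["energy"]],
                       dtype=np.float32)
    return np.concatenate([scalars,
                           np.asarray(feat["hist_mem"], dtype=np.float32),
                           np.asarray(feat["scene_prob"], dtype=np.float32)])
```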
Step 2.2, defining an action space formula;
including defining all possible actions that the system can perform, i.e., adjusting sensor parameters to optimize data acquisition and system performance;
The SLAM system adopts a mixed discrete-continuous space to realize fine control, and the mode selection part is discrete and comprises three sensor leading modes, namely a mode 1 is a depth camera-IMU mode and is suitable for the conditions of better illumination conditions and relatively stable environment structure, a mode 2 is a laser radar-depth camera mode and is suitable for environments with more dynamic objects, low illumination or no illumination scenes and the like, a mode 3 is a laser radar-IMU-depth camera mode and is suitable for complex and changeable environments, long-time running systems, high-precision positioning requirement scenes and the like;
The weight distribution part is continuous and comprises a sensor gain coefficient and a resource limiting parameter, wherein the sensor gain coefficient is the weight of the laser radar, the depth camera and the IMU, and the weight range is 0 to 1, and the expression is as follows
(9)
The resource limiting parameter is the laser radar sampling rate, the range is between 5Hz and 40Hz, and the expression is as follows:
(10)
The action space formula provides a mathematical framework for adjusting the sensor configuration for the system, so that the system can flexibly decide in a dynamic environment, and balance the data quality and the resource consumption, thereby improving the overall performance of the system;
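The hybrid discrete-continuous action space can be illustrated with the minimal sketch below, which clips a raw policy output onto the valid ranges of formulas (9) and (10); the dataclass layout and mode ordering are illustrative assumptions.

```python
import numpy as np
from dataclasses import dataclass

MODES = ("depth_camera_imu", "lidar_depth_camera", "lidar_imu_depth_camera")

@dataclass
class SlamAction:
    mode: int                 # discrete part: index into MODES
    gains: np.ndarray         # continuous part: lidar / camera / IMU weights in [0, 1]
    lidar_rate_hz: float      # resource-limiting parameter in [5, 40] Hz

def clip_action(raw_mode_logits, raw_gains, raw_rate):
    """Map raw network outputs onto a valid hybrid action (cf. formulas (9)-(10))."""
    mode = int(np.argmax(raw_mode_logits))
    gains = np.clip(raw_gains, 0.0, 1.0)
    rate = float(np.clip(raw_rate, 5.0, 40.0))
    return SlamAction(mode, gains, rate)
```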
step 2.3, rewarding a function module;
the SLAM system is guided to learn an optimal sensor selection strategy to achieve overall optimization of system performance. The positioning accuracy and the map construction quality can be improved, the resource utilization efficiency is considered, excessive consumption is avoided, and unreasonable sensor selection is prevented by punishing item constraint system behaviors, so that the whole SLAM system is ensured to stably and efficiently operate in complex and changeable environments.
Including primary rewards, auxiliary rewards, and penalty items;
The main rewards comprise positioning accuracy rewards and energy efficiency economic rewards which are respectively used for improving the positioning accuracy of the system and the energy utilization efficiency, and the expression is as follows:
(11)
(12)
where ATE represents the absolute track error, and,Representing the total instantaneous power consumption,A bonus compensation item representing when the remaining amount of power exceeds a threshold;
step 2.3.2 auxiliary rewards;
the method comprises rendering quality rewards and strategy stability rewards, aims at improving visual effects of scene reconstruction and ensuring continuity and reliability of strategies, and has the following expression:
(13)
(14)
Wherein SSIM represents a structural similarity index, LPIPS represents learning-aware image block similarity,Representing the penalty of the action mutation,Representing policy network parameter smoothness constraints;
the punishment items comprise fatal punishment errors and resource overload punishment and are used for avoiding unreasonable sensor selection and excessive consumption of resources, so that a system is guided to learn an optimal sensor selection strategy to realize comprehensive optimization of system performance, and the expression is as follows:
(15)
(16)
The structure can realize that the reward function comprehensively guides the system to optimize the selection of the sensor, thereby improving the overall performance;
The comprehensive rewards calculation formula is as follows:
(17)
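As an illustration of how the reward terms of formulas (11)-(16) might be combined into the composite reward of formula (17), the following sketch uses a simple weighted sum with placeholder weights; the weights and the signs of the individual terms are assumptions, since the exact formulas are given only in the patent figures.

```python
def composite_reward(m, w=None):
    """Weighted sum of the step 2.3 reward terms (cf. formula (17)).
    `m` is a dict of per-step measurements; the weights `w` are illustrative
    placeholders, not values taken from the patent."""
    w = w or {"ate": 1.0, "energy": 0.5, "render": 0.3, "stability": 0.2, "penalty": 1.0}
    r_loc = -m["ate"]                                    # positioning-accuracy reward (11)
    r_energy = -m["power_w"] + m.get("energy_bonus", 0)  # energy-efficiency reward (12)
    r_render = m["ssim"] - m["lpips"]                    # rendering-quality reward (13)
    r_stab = -m["action_change"] - m["param_rough"]      # strategy-stability reward (14)
    r_pen = -(m["fatal_error"] + m["overload"])          # penalty items (15)-(16)
    return (w["ate"] * r_loc + w["energy"] * r_energy + w["render"] * r_render
            + w["stability"] * r_stab + w["penalty"] * r_pen)
```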
step 2.4, designing a network architecture;
as the core of the reinforcement learning sensor selection module, it can decide how the system processes multi-modal sensor data and makes intelligent decisions, including,
A state encoder, which is the basis of the whole network, is responsible for converting the multimodal sensor data into a unified representation of features so that subsequent policy and value networks can make decisions and evaluations based on these features, including,
A. Backbone network-using ResNet-18 architecture to handle visual features. ResNet-18 is a classical depth residual error network, can effectively extract advanced semantic information in an image, simultaneously relieves the degradation problem of a deep network through residual error connection, and ensures the training effect and the feature extraction capability of the network.
B. And (3) branching the network, namely extracting geometric features by using a point cloud transducer. The transducer architecture has advantages in processing sequence data, can capture global dependency in point cloud data, extracts more representative geometric features, and is helpful for a system to understand the three-dimensional structure of the environment.
C. Timing network-processing IMU sequences using Bi-LSTM. Bi-LSTM can simultaneously consider past and future information of IMU data, capture two-way dependency relationships in a time sequence, thereby modeling dynamic characteristics of the IMU data more accurately and providing stable attitude estimation for the system.
And the strategy network is responsible for outputting specific action selection according to the coded state characteristics, wherein the specific action selection comprises discrete action modes and continuous parameter adjustment. Wherein,
A. mode selection branch, adopting Gumbel-Softmax to output discrete action. Gumbil-Softmax is a technique that balances between continuous relativity and discrete sampling, so that the network can effectively learn the choice of discrete actions in the training process, and simultaneously maintain the transmissibility of gradients, and is suitable for discrete decision tasks such as selecting sensor dominant modes.
B. and a parameter adjusting branch, which outputs continuous weights by using a Tanh activation function. The Tanh activation function can limit the output to the range of [ -1, 1], and by appropriate scaling and offset, it can be mapped to a desired continuous parameter space, such as a sensor gain coefficient and a sampling rate parameter, etc., to achieve fine adjustment of the sensor parameters.
The value network is used for evaluating the advantages and disadvantages of the current state and the action combination, providing a learning signal for the strategy network to help the strategy network optimize the decision process,
Dueling DQN structure: separation status value and dominance function. The Dueling DQN structure divides the value network into two parts, respectively estimates the state cost function (V) and the dominance function (A), and then obtains the final Q value through a specific combination mode. The separation structure enables the network to evaluate the relative advantages and disadvantages of different actions in the current state more accurately, and learning efficiency and stability are improved.
B. Multi-head attention dynamically weights multi-modal features. The multi-head attention mechanism can pay attention to different aspects of different modal characteristics at the same time, dynamically adjust the weight of each modal characteristic according to the current task demand, realize effective fusion of multi-modal information and enhance the adaptability of the network to complex environments.
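A minimal PyTorch sketch of this architecture is given below: a ResNet-18 image branch, a point-feature Transformer branch and a Bi-LSTM IMU branch fused into one state embedding, a policy head with a Gumbel-Softmax mode branch and a Tanh parameter branch, and a Dueling value head. All layer sizes are illustrative assumptions, and the multi-head attention weighting of the value network is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18   # torchvision >= 0.13 assumed

class StateEncoder(nn.Module):
    """ResNet-18 for images, a Transformer encoder for point features,
    a Bi-LSTM for IMU sequences; fused into one state embedding."""
    def __init__(self, d=256):
        super().__init__()
        backbone = resnet18(weights=None)
        self.img = nn.Sequential(*list(backbone.children())[:-1])          # -> 512
        self.pts = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2)
        self.imu = nn.LSTM(6, 64, bidirectional=True, batch_first=True)    # -> 128
        self.fuse = nn.Linear(512 + 64 + 128, d)

    def forward(self, img, pts, imu):
        f_img = self.img(img).flatten(1)       # (B, 512)
        f_pts = self.pts(pts).mean(dim=1)      # (B, 64), pts: (B, N, 64) point features
        f_imu, _ = self.imu(imu)               # imu: (B, T, 6) accel + gyro
        return F.relu(self.fuse(torch.cat([f_img, f_pts, f_imu[:, -1]], dim=1)))

class PolicyHead(nn.Module):
    """Discrete mode via Gumbel-Softmax, continuous weights/rate via Tanh."""
    def __init__(self, d=256, n_modes=3, n_params=4):
        super().__init__()
        self.mode_logits = nn.Linear(d, n_modes)
        self.params = nn.Linear(d, n_params)

    def forward(self, s, tau=1.0):
        mode = F.gumbel_softmax(self.mode_logits(s), tau=tau, hard=True)
        params = torch.tanh(self.params(s))    # later rescaled to [0, 1] and [5, 40] Hz
        return mode, params

class DuelingValueHead(nn.Module):
    """Dueling structure: Q(s, a) = V(s) + A(s, a) - mean(A)."""
    def __init__(self, d=256, n_actions=3):
        super().__init__()
        self.v = nn.Linear(d, 1)
        self.a = nn.Linear(d, n_actions)

    def forward(self, s):
        v, a = self.v(s), self.a(s)
        return v + a - a.mean(dim=1, keepdim=True)
```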
Step 2.5, executing a training strategy;
through an efficient data collection mechanism, an intelligent exploration strategy and strict safety learning constraint, a comprehensive training framework is provided for the self-adaptive selection module of the reinforcement learning sensor, which comprises,
The data collection mechanism provides experience data for interaction with the environment for the system and is the basis for learning and optimizing strategies. In particular, the method comprises the steps of,
A. and (3) deploying an environment, namely selecting Gazebo simulation environments, providing high-fidelity scene simulation and flexible sensor configuration options, and testing and training a robot algorithm.
B. The sampling frequency is set to 120FPS real-time sampling, and the high frame rate captures finer environmental dynamics and robot motion state, so that richer information is provided for subsequent data processing and strategy learning.
The exploration strategy determines how the system can effectively explore in an unknown or partially known environment to obtain more information and experience. In particular, the method comprises the steps of,
A. Adaptive epsilon-greedy exploration: as training proceeds, the exploration rate epsilon gradually decreases, so the system transitions from wide exploration to exploitation of learned knowledge, balancing exploration and exploitation and improving policy performance; the expression is as follows:
(18)
b. Adding state prediction error rewards, namely estimating the next state through a prediction model and taking the prediction error as the intrinsic rewards, wherein the expression is as follows:
(19)
wherein the terms in formula (19) denote, respectively, the intrinsic reward obtained when the system takes an action in a given state, an adjustment factor controlling the intensity of the intrinsic reward, the system's prediction of the next state estimated by a forward model, and the actual next-state feature;
Safety learning constraints ensure that the system not only pursues high performance in the learning and decision process, but also meets a series of safety and practicality requirements. In particular, the method comprises the steps of,
Action Masking, which disables the selection of high power actions that exceed the battery capacity, i.e., masking out those actions in the action space that would cause the battery to drain quickly. The system is ensured to consider the energy limitation in the decision process, unreasonable high-power consumption behaviors are avoided, and the practicability and the sustainability of the system are improved.
B. Policy gradient correction, the expression is as follows:
(20)
wherein the terms in formula (20) denote, respectively, the gradient of the probability distribution of the policy network's output actions with respect to the network parameters, the action value function, and a baseline function.
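The three training mechanisms above can be illustrated as follows; the decay constants, the curiosity scale and the masking rule are illustrative assumptions consistent with the intent of formulas (18)-(20) but not taken from the patent.

```python
import numpy as np

def epsilon(step, eps_start=1.0, eps_end=0.05, decay=20_000):
    """Adaptive epsilon-greedy schedule (cf. formula (18)): wide exploration
    early, exploitation of learned knowledge later."""
    return eps_end + (eps_start - eps_end) * np.exp(-step / decay)

def intrinsic_reward(pred_next_state, next_state, eta=0.1):
    """Curiosity-style bonus: forward-model prediction error (cf. formula (19))."""
    return eta * float(np.sum((pred_next_state - next_state) ** 2))

def mask_high_power_actions(q_values, action_power_w, power_budget_w):
    """Action masking for safe learning: disallow modes whose power draw
    exceeds the remaining power budget."""
    q = q_values.copy()
    q[np.asarray(action_power_w) > power_budget_w] = -np.inf
    return q
```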
The application enables the sensor decision to achieve dynamic balance in time-space-energy triple dimensionality through the hierarchical state representation, the mixed action space and the depth combination of the multi-objective rewarding function, thereby realizing the advantages of achieving dynamic environment adaptation, resource optimization management, improving positioning and mapping precision, enhancing system robustness and supporting long-term stable operation of the SLAM system.
Step 3, multi-source sensor data fusion processing;
based on the data fusion processing module, the sensor is used for carrying out multi-source sensor data fusion processing, so that rich and accurate data are provided for the construction of the Gaussian map;
The reinforcement learning sensor selection module adaptively selects the most suitable sensor mode according to the influence of each parameter on the sensor mode selection according to the current environmental condition and the system state, wherein the weight function expression is as follows:
S = w1·S_lidar + w2·S_dyn + w3·S_imu + w4·S_cons + w5·S_load + w6·S_energy (21)
wherein w1, w2, w3, w4, w5 and w6 respectively denote the weights of the lidar confidence S_lidar, the dynamic object coverage S_dyn, the IMU confidence S_imu, the radar-vision consistency S_cons, the computing load S_load and the energy surplus budget S_energy;
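A minimal sketch of this weighted score and of the threshold-based mode choice described in steps 3.1-3.3 below is given here; the weight dictionary and the two thresholds are illustrative assumptions, since in the described system they are produced and tuned by the reinforcement learning policy.

```python
def sensor_mode(feat, w, th_cam_imu=0.7, th_lidar_cam=0.4):
    """Weighted score S of formula (21) and a threshold-based mode choice
    (cf. steps 3.1-3.3); `feat` holds the step 2.1 features, `w` the weights."""
    S = (w["lidar"] * feat["rho_grad"] + w["dyn"] * feat["dyn_cov"]
         + w["imu"] * feat["imu_conf"] + w["cons"] * feat["depth_iou"]
         + w["load"] * feat["gpu_load"] + w["energy"] * feat["energy"])
    if S >= th_cam_imu:
        return "depth_camera_imu", S           # mode 1 (step 3.1)
    if S >= th_lidar_cam:
        return "lidar_depth_camera", S         # mode 2 (step 3.2)
    return "lidar_imu_depth_camera", S         # mode 3 (step 3.3)
```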
comprising the following steps:
step 3.1, a depth camera-IMU mode;
When, according to the current environmental characteristics, the reinforcement learning sensor selection module observes low lidar confidence, low dynamic object coverage, high IMU confidence, low radar-vision consistency, moderate computing load and a high remaining energy budget, for example in environments with good illumination, a relatively stable structure, many mirror-like objects and few but richly textured dynamic objects, it judges from the weight function S and the corresponding threshold that the SLAM system should select mode 1, that is, the sensor combination of depth camera and IMU, with the depth camera dominant and the IMU auxiliary;
step 3.1.1, data input and key frame selection;
The frames containing new scene structures or salient feature increases are preferentially selected as key frames, so that the integrity and efficiency of the map are improved;
step 3.1.2, performing point cloud registration by adopting generalized ICP tracking (GICP);
select the firstRGB image of frameAnd depth imageGenerating a point cloudWherein each pointAnd calculates covariance matrix of each point. Estimating a current frame source point cloud by GICP algorithmPoint cloud with target mapRelative pose transformation between themAnd includes the steps of (a) a base,
A. distribution distance calculation modeling each point as a gaussian distributionThe source point cloud is transformedAfter that, distance from the target point cloudThe definition is as follows:
(22)
The distribution is as follows:
(23)
Wherein,,Representing coordinates of corresponding points in the target map and the source point cloud,,Representing a target point cloud and a source point cloud covariance matrix;
b. maximum likelihood estimation by maximizing the log likelihood of the probability density function to solve for the optimal transformation:
(24)
The optimization objective is reduced to minimize the mahalanobis distance, expressed as follows:
(25)
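As a stand-in for this distribution-to-distribution registration, the sketch below calls Open3D's generalized ICP (available in Open3D 0.14 and later); it approximates the Mahalanobis-distance minimization of formulas (22)-(25) with a library implementation rather than reproducing the patent's own derivation, and the correspondence distance and neighborhood size are illustrative values.

```python
import numpy as np
import open3d as o3d

def gicp_track(source_pts, target_pts, init_T=np.eye(4), max_corr_dist=0.5):
    """Estimate the relative pose between the current-frame point cloud and the
    map point cloud with generalized ICP; `init_T` is typically the IMU
    pre-integration guess of step 3.1.3."""
    src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(source_pts))
    tgt = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(target_pts))
    for pc in (src, tgt):
        # per-point covariances approximate the Gaussian model of formula (23)
        pc.estimate_covariances(o3d.geometry.KDTreeSearchParamKNN(knn=20))
    reg = o3d.pipelines.registration.registration_generalized_icp(
        src, tgt, max_corr_dist, init_T,
        o3d.pipelines.registration.TransformationEstimationForGeneralizedICP())
    return reg.transformation, reg.fitness
```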
step 3.1.3 IMU pre-integration;
The IMU high-frequency measurement value is fused to generate the interframe motion pre-integral quantity as the initial guess of GICP, the tracking efficiency is improved, the system robustness and accuracy are enhanced by combining with GICP algorithm,
A. constructing an IMU measurement model original measured value:
(26)
(27)
Wherein,,Representing raw acceleration and angular velocity measurements,Representing the rotation of the IMU to the world coordinate system,Representing the gravity vector of the gravity force,/Indicating that the sensor is biased in a direction,/Representing sensor noise;
b. performing motion pattern recursion and positionSpeed ofRotatingThe recurrence formula of (2) is as follows:
(28)
(29)
(30)
c. calculating an IMU pre-integral quantity:
To avoid repeated integration, define from the firstFrame to the firstThe amount of inter-frame relative motion of a frame is expressed as follows:
(31)
(32)
(33)
Wherein,,,Representing the product of position, velocity, and rotation, respectively,An oblique symmetry matrix representing angular velocity
D. Computing relative transformations between successive framesUsing external parameters between the camera and IMU sensorThe relative transformation is converted into camera coordinates. Obtaining GICP a good initial guess of tracking from IMU pre-integration;
(34)
(35)
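A simplified Euler-integration sketch of this pre-integration is shown below; it ignores measurement noise and bias random walk, uses scipy rotations, and the extrinsic handling in the initial-guess helper follows formulas (34)-(35) only schematically.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def preintegrate(acc, gyr, dt, bias_a, bias_g):
    """Accumulate the relative position / velocity / rotation between two
    keyframes from IMU samples (simplified Euler scheme, cf. formulas (31)-(33));
    gravity compensation is assumed to happen in the state propagation."""
    dp = np.zeros(3); dv = np.zeros(3); dR = Rotation.identity()
    for a_m, w_m in zip(acc, gyr):
        a = a_m - bias_a                          # bias-corrected specific force
        w = w_m - bias_g                          # bias-corrected angular rate
        dp += dv * dt + 0.5 * dR.apply(a) * dt**2
        dv += dR.apply(a) * dt
        dR = dR * Rotation.from_rotvec(w * dt)    # right-multiplied rotation increment
    return dp, dv, dR

def initial_guess(T_prev_cam, dp, dR, T_cam_imu):
    """Convert the IMU-frame relative motion into a camera-frame pose guess for
    GICP (cf. formulas (34)-(35)); all T_* are 4x4 homogeneous matrices."""
    T_rel_imu = np.eye(4)
    T_rel_imu[:3, :3] = dR.as_matrix()
    T_rel_imu[:3, 3] = dp
    return T_prev_cam @ T_cam_imu @ T_rel_imu @ np.linalg.inv(T_cam_imu)
```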
Step 3.1.4, updating the pre-integration;
The result of GICP optimization is used to correct the IMU integral drift over time due to noise and bias, including,
A. camera pose optimizing GICPThe expression is as follows:
(36)
Wherein,Representing the camera's external parameters to the IMU,AndThe results of GICP tracking position and rotation are shown, respectively.
B. Updating the IMU state quantity:,, (37)
Step 3.2 lidar-depth camera mode;
When, according to the current environmental characteristics, the reinforcement learning sensor selection module observes high lidar confidence, high dynamic object coverage, low IMU confidence, medium radar-vision consistency, a larger computing load and a medium remaining energy budget, for example in low-illumination or low-texture environments, it judges from the weight function S and the corresponding threshold that the SLAM system should select mode 2, that is, the sensor combination of lidar and depth camera, with the depth camera dominant and the lidar auxiliary,
Step 3.2.1 data input an image from a depth camera and a point cloud from a lidar;
Integration is performed by using calibrated external signals, and time aligned LiDAR point clouds are converted into depth images, wherein the expression is as follows:
(38)
wherein the terms in formula (38) denote, respectively, the lidar point cloud, the rotation matrix and translation vector from the lidar to the camera coordinate system, and the intrinsic matrix of the camera;
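A minimal sketch of the projection in formula (38) follows; the near-plane cutoff and the nearest-return handling per pixel are illustrative assumptions.

```python
import numpy as np

def lidar_to_depth_image(points_lidar, R_lc, t_lc, K, h, w):
    """Project a lidar point cloud into the camera frame and rasterize a depth
    image (cf. formula (38)): p_cam = R_lc @ p_lidar + t_lc, uv ~ K @ p_cam."""
    p_cam = points_lidar @ R_lc.T + t_lc                 # (N, 3) in the camera frame
    front = p_cam[:, 2] > 0.1                            # keep points in front of the camera
    p_cam = p_cam[front]
    uvw = p_cam @ K.T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    z = p_cam[:, 2]
    depth = np.full((h, w), np.inf, dtype=np.float32)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for ui, vi, zi in zip(u[inside], v[inside], z[inside]):
        depth[vi, ui] = min(depth[vi, ui], zi)           # keep the nearest return per pixel
    depth[np.isinf(depth)] = 0.0                         # 0 marks pixels with no lidar return
    return depth
```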
Step 3.2.2 using an incremental error minimization function;
To ensure accurate correspondence between planes and points, the formula is as follows:
(39) $E(T^{(k)}) = \sum_{p_i\in P_L} w_{i}\left(n_{i}^{\top}\left(T^{(k)}p_{i} - \mu_{i}\right)\right)^{2} + E_{n}$
where $p_{i}$ denotes a point in the lidar point cloud $P_L$, $T^{(k)}$ is the current pose estimate propagated from the previous moment to the world coordinate system after $k$ iterations, $\mu_{i}$ is the Gaussian centre closest to $T^{(k)}p_{i}$, $n_{i}$ is its normal vector, $w_{i}$ is the weight of the point, and $E_{n}$ is a regularization term used to enhance the stability and accuracy of the error function by taking the directional error between normal vectors into account;
The regularization term $E_{n}$ is introduced to enhance the stability and accuracy of the error function and accounts for the error in the normal direction; the expression is as follows:
(40) $E_{n} = \sum_{i}\left(1 - \left|n_{p_i}^{\top}\, n_{g_i}\right|\right)$
where $n_{g}$ is the normal of the current Gaussian distribution and $n_{p_i}$ is the local normal estimated at point $p_{i}$;
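The incremental error of step 3.2.2, as reconstructed above, could be evaluated along the lines of the following sketch; the nearest-Gaussian association is assumed to have been computed beforehand, and the regularisation weight lam is a placeholder value.

```python
import numpy as np

def incremental_error(points, T, centers, normals, point_normals, weights, lam=0.1):
    """Weighted point-to-plane residual against the nearest Gaussian centres,
    plus a normal-direction regularisation term (a sketch of eqs. (39)-(40))."""
    R, t = T[:3, :3], T[:3, 3]
    p_w = (R @ points.T).T + t                               # points in the world frame
    plane = np.einsum('ij,ij->i', normals, p_w - centers)    # signed point-to-plane distances
    e_geo = np.sum(weights * plane ** 2)
    # normal-direction regulariser: penalise misalignment between the point
    # normals (rotated into the world frame) and the Gaussian normals
    n_w = (R @ point_normals.T).T
    cos = np.abs(np.einsum('ij,ij->i', n_w, normals))
    e_reg = np.sum(1.0 - np.clip(cos, 0.0, 1.0))
    return e_geo + lam * e_reg
```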
step 3.2.3 weight function calculation;
To distinguish between gaussian points generated solely by color supervision and gaussian points generated simultaneously by lidar depth, the system incorporates a weighting function. The weighting function combines the consistency of the normal vector, the density factor and the texture complexity to evaluate the reliability of different gaussian points. The weight function is calculated as follows:
a. Determine the Gaussian centres within a local spherical region: find all of the nearest Gaussian distribution centres inside the sphere, where $c$ is the sphere centre and $r$ is the radius;
b. Calculate the density function of the Gaussian points; the density function $\rho$ is calculated by the following formula:
(41) $\rho(p) = \sum_{k\in\mathcal{N}(c,r)}\exp\!\left(-\tfrac{1}{2}\,(p-\mu_{k})^{\top}\tilde{\Sigma}_{k}^{-1}(p-\mu_{k})\right)$
where $\tilde{\Sigma}_{k}$ is a reconstructed covariance matrix, built by selecting the smallest variance $\sigma_{n}$ along the normal direction and a larger variance $\sigma_{t}$ in the perpendicular direction.
c. Simplify the density function calculation: to accelerate the computation, the calculation of the density function is simplified during tracking:
(42)
d. Consistency calculation: for each point $p_{i}$, compute the consistency $c_{i}$ between the normal of the current Gaussian distribution and the local average normal;
e. Texture complexity calculation: for the image region corresponding to each radar point, compute the local texture complexity. The radar point $p$ is projected into the pixel coordinate system of the camera to obtain its position $(u,v)$ in the image; an $s\times s$ image block ($s=16$) centred on this position is extracted and converted into a grey-scale image, and the variance of the pixel intensities is calculated:
(43) $\sigma^{2} = \frac{1}{s^{2}}\sum_{(u,v)\in\Omega}\left(I(u,v) - \bar{I}\right)^{2}$
where $I(u,v)$ is the pixel intensity and $\bar{I}$ is the mean value of the image block;
The variance is converted into a texture weight between 0 and 1 through a sigmoid function:
(44) $w_{\mathrm{tex}} = \dfrac{1}{1 + e^{-k\sigma^{2}}}$
where $k$ is a scaling factor used to adjust the variance sensitivity;
f. Final weight function: the final weight function $w_{i}$ is defined as the product of the normal consistency, the density function and the texture complexity, namely $w_{i} = c_{i}\cdot\rho_{i}\cdot w_{\mathrm{tex},i}$;
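Steps a to f can be combined into a single per-point weight as in the sketch below; the neighbourhood radius, the 16-pixel patch size and the sigmoid gain k_tex follow the description above, while the concrete data layout (a KD-tree over Gaussian centres, pre-projected pixel coordinates, a grey image normalised to [0,1]) is an assumption of this example.

```python
import numpy as np
from scipy.spatial import cKDTree

def gaussian_point_weights(points, gauss_centers, gauss_normals, gray_image,
                           pixels, radius=0.5, patch=16, k_tex=5.0):
    """Per-point reliability weight w = consistency * density * texture,
    following steps a-f above."""
    tree = cKDTree(gauss_centers)
    half = patch // 2
    h, w_img = gray_image.shape
    weights = np.zeros(len(points))
    for i, (p, (u, v)) in enumerate(zip(points, pixels)):
        idx = tree.query_ball_point(p, r=radius)          # a. centres inside the sphere
        if not idx:
            continue
        density = len(idx) / ((4.0 / 3.0) * np.pi * radius ** 3)   # c. simplified density
        mean_n = gauss_normals[idx].mean(axis=0)
        mean_n /= (np.linalg.norm(mean_n) + 1e-9)
        _, nearest = tree.query(p)
        consistency = abs(gauss_normals[nearest] @ mean_n)          # d. normal consistency
        u0, u1 = max(0, u - half), min(w_img, u + half)             # e. local texture variance
        v0, v1 = max(0, v - half), min(h, v + half)
        var = gray_image[v0:v1, u0:u1].var()
        w_tex = 1.0 / (1.0 + np.exp(-k_tex * var))                  # sigmoid texture weight
        weights[i] = consistency * density * w_tex                   # f. final weight
    return weights
```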
Step 3.3, a multi-sensor equalization mode of the laser radar-IMU-depth camera;
When the reinforcement learning sensor selection module observes, from the current environmental characteristics, a medium lidar confidence, high dynamic-object coverage, high IMU confidence, low radar-vision consistency, a large computational load and a high energy surplus budget (for example in a complex, highly dynamic environment or a long-running task requiring high-precision positioning), and the weight function S reaches the judgment threshold for mode 3, the SLAM system adopts the sensor mode combining lidar, depth camera and IMU, which comprises,
Step 3.3.1, synchronizing the data input with the hardware;
the laser radar provides sparse but high-precision 3D point cloud, the depth camera captures RGB textures of a scene, and the IMU outputs angular speed and acceleration at high frequency for motion prediction, so that strict alignment of data time stamps of the three are ensured, and time sequence drift is avoided;
Step 3.3.2, key frame selection is carried out through depth camera input;
selecting representative frames from the continuous data stream as key frames, reducing redundancy calculation;
step 3.3.3. IMU data is used for state propagation;
According to the input IMU data and the state of the last key frame, the current state is predicted through IMU pre-integration; this prediction is used for state estimation and for forward prediction in motion de-distortion;
step 3.3.4, de-distorting the laser radar input;
Using the continuous poses predicted by the IMU, each lidar point $p_{i}$ is transformed from the local coordinate system at its scanning instant to the global coordinate system; the expression is as follows:
(45) $p_{i}^{W} = R(t_{i})\,p_{i} + t(t_{i})$
where $t_{i}$ denotes the acquisition timestamp of point $p_{i}$, and $R(t_{i})$ and $t(t_{i})$ denote the pose at time $t_{i}$ obtained by IMU interpolation.
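A compact sketch of the de-distortion in step 3.3.4: per-point poses are interpolated from the IMU-propagated trajectory (spherical interpolation for rotation, linear interpolation for translation) and applied to each lidar point, mirroring expression (45). The interpolation scheme is an assumption; the method only requires IMU-interpolated poses.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew_scan(points, timestamps, pose_times, rotations, translations):
    """Transform every LiDAR point from its per-point scan time into the global
    frame using poses interpolated from the IMU propagation.
    rotations: scipy Rotation holding the propagated orientations at pose_times;
    timestamps must lie inside [pose_times[0], pose_times[-1]]."""
    slerp = Slerp(pose_times, rotations)                  # rotation interpolation
    R_i = slerp(timestamps).as_matrix()                   # per-point rotation (N,3,3)
    t_i = np.stack([np.interp(timestamps, pose_times, translations[:, k])
                    for k in range(3)], axis=1)           # per-point translation (N,3)
    return np.einsum('nij,nj->ni', R_i, points) + t_i
```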
Step 3.4: after the reinforcement learning sensor selection module executes an action and interacts with the environment, the historical decision memory is recorded through the reward function module and evaluated through the value network, and the update of the whole policy network is optimized.
Step 4, updating and optimizing the map;
Based on the map update and back-end optimization module, the rich map data provided by the front-end multi-modal sensor data fusion processing module is used, in combination with 4D GS, to perform Gaussian map update and optimization of the environment map, and error correction and map optimization are carried out through loop closure detection;
Comprises the steps of,
Step 4.1, maintaining a sliding window;
In the data fusion processing module, the system maintains a sliding window that screens and selects the point cloud from the nearest 10 time frames in the gaussian map to construct gaussian points while masking out the remaining gaussian points. This selection process ensures that the gaussian points are correlated in the sub-map of current interest;
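The sliding-window selection of step 4.1 amounts to a mask over the per-Gaussian creation frames; a minimal sketch, assuming each Gaussian stores the index of the time frame in which it was created, is shown below.

```python
import numpy as np

def sliding_window_mask(gaussian_frame_ids, current_frame, window=10):
    """Boolean mask selecting Gaussians created within the last `window` time
    frames; the remaining Gaussians are masked out of the active sub-map."""
    frame_ids = np.asarray(gaussian_frame_ids)
    low = current_frame - window + 1
    return (frame_ids >= low) & (frame_ids <= current_frame)
```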
Step 4.2, 4D Gaussian distribution;
A Gaussian deformation field network is used to model the motion and shape change of the Gaussian distributions of dynamic objects. The network consists of an efficient spatio-temporal structure encoder and a multi-head Gaussian deformation decoder. The goal is to achieve efficient representation and real-time rendering of dynamic scenes by learning a Gaussian deformation field that transforms the canonical 3D Gaussian distribution to a new position and shape; the deformation field network $\mathcal{F}$ of 4D GS predicts the deformation of the current-frame point cloud, with the expression as follows:
(46) $G' = G + \Delta G,\qquad \Delta G = \mathcal{F}(G, t)$
where $G'$ denotes the deformed three-dimensional Gaussian.
Specifically, the goal of the spatio-temporal structure encoder $\mathcal{H}$ is to efficiently encode the spatial and temporal features of the 3D Gaussian distribution. It consists of a multi-resolution HexPlane module and a small multi-layer perceptron (MLP).
The multi-resolution HexPlane module encodes the spatial and temporal features of the 3D Gaussian distribution by decomposing a 4D neural voxel into multiple 2D planes, which can be sampled and encoded at different resolutions. Given the input Gaussian map and the timestamp $t$, the centre coordinates of the 3D Gaussian distribution $G$ and the timestamp $t$ are used to query the multi-resolution plane modules, and the voxel features are acquired by bilinear interpolation. The encoder comprises six multi-resolution plane modules $R_{l}(i,j)$, with $(i,j)\in\{(x,y),(x,z),(y,z),(x,t),(y,t),(z,t)\}$, where $l$ denotes the resolution level.
Each plane module $R_{l}(i,j)$ is defined as $R_{l}(i,j)\in\mathbb{R}^{h\times lN_{i}\times lN_{j}}$, where $h$ is the hidden dimension of the features and $N$ is the basic resolution of the voxel grid.
The voxel features are queried through bilinear interpolation; the formula is as follows:
(47) $f_{h} = \bigcup_{l}\prod_{(i,j)}\mathrm{interp}\!\left(R_{l}(i,j)\right)$
where $f_{h}$ is the neural voxel feature.
A small MLP $\phi_{d}$ is used to merge all the features:
(48) $f_{d} = \phi_{d}(f_{h})$
where $f_{d}$ is the final feature representation.
In particular, a multi-headed gaussian deformation decoder D is used to decode the deformation of each 3D gaussian distribution from the characteristics obtained by the encoder. It consists of three independent MLPs, calculating the position, rotation and scaling deformations, respectively.
The position deformation head $\phi_{x}$ is used to compute the position deformation $\Delta\mathcal{X}$; the formula is:
(49) $\Delta\mathcal{X} = \phi_{x}(f_{d})$
The rotation deformation head $\phi_{r}$ is used to compute the rotation deformation $\Delta r$; the formula is:
(50) $\Delta r = \phi_{r}(f_{d})$
The scaling deformation head $\phi_{s}$ is used to compute the scaling deformation $\Delta s$; the formula is:
(51) $\Delta s = \phi_{s}(f_{d})$
These deformations are applied to the original 3D Gaussian distribution to obtain the deformed Gaussian distribution $G'$; the formula is:
(52) $G' = \left(\mathcal{X}+\Delta\mathcal{X},\; r+\Delta r,\; s+\Delta s\right)$
where $\mathcal{X}+\Delta\mathcal{X}$ is the new position, $r+\Delta r$ is the new rotation, and $s+\Delta s$ is the new scaling.
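To make the encoder/decoder split of step 4.2 concrete, the sketch below implements a single-resolution HexPlane-style encoder and a three-head deformation decoder in PyTorch. The plane resolution, feature widths and the quaternion parameterisation of the rotation offset are illustrative assumptions; the method as described uses multiple resolution levels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HexPlaneEncoder(nn.Module):
    """Six learnable 2D feature planes over (x,y),(x,z),(y,z),(x,t),(y,t),(z,t),
    queried by bilinear interpolation and fused by a small MLP
    (single resolution level only, for brevity)."""
    def __init__(self, feat_dim=32, res=64):
        super().__init__()
        self.planes = nn.ParameterList(
            [nn.Parameter(0.1 * torch.randn(1, feat_dim, res, res)) for _ in range(6)])
        self.pairs = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]  # indices into (x,y,z,t)
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 64))

    def forward(self, xyzt):                      # xyzt: (N,4), coordinates normalised to [-1,1]
        feat = 1.0
        for plane, (i, j) in zip(self.planes, self.pairs):
            grid = xyzt[:, [i, j]].view(1, -1, 1, 2)
            sampled = F.grid_sample(plane, grid, align_corners=True)   # (1,C,N,1), bilinear
            feat = feat * sampled.squeeze(0).squeeze(-1).t()           # multiply plane features
        return self.mlp(feat)                     # fused per-Gaussian feature f_d

class DeformationDecoder(nn.Module):
    """Multi-head decoder producing position, rotation and scaling offsets."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.head_x = nn.Linear(feat_dim, 3)      # position deformation
        self.head_r = nn.Linear(feat_dim, 4)      # rotation deformation (quaternion offset)
        self.head_s = nn.Linear(feat_dim, 3)      # scaling deformation

    def forward(self, f_d):
        return self.head_x(f_d), self.head_r(f_d), self.head_s(f_d)
```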
Step 4.3, introducing an optical flow to solve the 4D GS overfitting problem;
During real-time operation of the SLAM system, 4D GS tends to overfit when constructing the environment map, owing to the large amount of newly arriving data and the noise contained in that data.
Specifically, optical flow computation is adopted to capture the motion information of pixels in the time dimension, providing an additional temporal-consistency constraint for the model; the RAFT optical flow algorithm is used to calculate the pixel motion between adjacent timestamps. This comprises the following steps,
Step 4.3.1, inputting an image;
For each pair of adjacent time-stamped images $I_{t}$ and $I_{t+1}$, the optical flow $F_{t\rightarrow t+1}^{\mathrm{obs}}$ is computed using the pre-trained RAFT network, while the pixel motion predicted by 4D GS is $F_{t\rightarrow t+1}^{\mathrm{pred}}$;
Step 4.3.2, constraining the deformation of the 3D Gaussian points through optical flow;
To ensure that the dynamic part predicted by the model is consistent with the optical flow, an optical-flow-constrained loss function is constructed; the expression is as follows:
(53) $\mathcal{L}_{\mathrm{flow}} = \sum_{(u,v)}\left\|F^{\mathrm{obs}}(u,v) - F^{\mathrm{pred}}(u,v)\right\|_{1}$
where $F^{\mathrm{obs}}$ denotes the observed optical flow computed from the RGB images by RAFT, and $F^{\mathrm{pred}}$ denotes the pixel-level motion prediction rendered from the 4D GS deformation field;
Step 4.3.3, predicting the change of the Gaussian parameters based on the deformation field network of the 4D GS;
The deformation field network $\mathcal{F}$ of 4D GS predicts the changes of the Gaussian parameters, including the position change $\Delta x$, the rotation change $\Delta r$ and the scaling change $\Delta s$; for the $k$-th Gaussian, the deformed position is expressed as follows:
(54) $x_{k}'(t) = x_{k} + \Delta x_{k}(t)$
where $x_{k}$ denotes the position of the initial Gaussian and $\Delta x_{k}(t)$ denotes the position offset predicted by the deformation field;
Step 4.3.4, project the Gaussians onto the image plane through a differentiable splatting process and compute the pixel-level motion;
The expression is as follows:
(55) $F^{\mathrm{pred}}(u,v) = \sum_{k} w_{k}(u,v)\left(\pi\!\left(x_{k}'(t+1)\right) - \pi\!\left(x_{k}'(t)\right)\right)$
where $\pi(\cdot)$ denotes the camera projection and $w_{k}(u,v)$ the splatting weight of the $k$-th Gaussian at pixel $(u,v)$;
step 4.3.5 introducing optical flow confidence to filter unreliable optical flow predictions;
To reduce the influence of noise in the optical flow, the confidence map $C(u,v)$ provided by the optical flow algorithm is used so that the optical flow constraint loss is applied only to regions with higher confidence; the expression is as follows:
(56) $\mathcal{L}_{\mathrm{flow}}^{w} = \sum_{(u,v)} C(u,v)\,\left\|F^{\mathrm{obs}}(u,v) - F^{\mathrm{pred}}(u,v)\right\|_{1}$
where $C(u,v)$ denotes the value of the optical flow confidence at position $(u,v)$;
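A possible form of the confidence-gated flow loss of step 4.3.5 is sketched below; the threshold tau and the hard masking (rather than soft weighting) are assumptions of this example, and the confidence map is taken to come from whatever flow-consistency measure the system provides, since RAFT itself does not emit one natively.

```python
import torch

def flow_constraint_loss(flow_obs, flow_pred, confidence, tau=0.5):
    """Confidence-gated optical-flow constraint (eq. (56)-style): the observed
    flow supervises the splatted 4D-GS motion only where confidence exceeds tau.
    flow_obs, flow_pred: (H,W,2) tensors; confidence: (H,W) tensor in [0,1]."""
    mask = (confidence > tau).float().unsqueeze(-1)       # (H,W,1) reliability mask
    diff = torch.abs(flow_obs - flow_pred) * mask          # masked L1 difference
    return diff.sum() / mask.sum().clamp(min=1.0)
```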
step 4.4, updating and optimizing the map;
A static 3D Gaussian distribution is initialized with the Structure-from-Motion (SfM) method; for the first 3000 iterations only the static 3D Gaussians are optimized, i.e. the 3D Gaussians $G$ are used instead of the deformed 4D Gaussians $G'$ for image rendering. In this way a reasonable initial 3D Gaussian distribution is learned, the dynamic and static parts are separated, the burden of learning large deformations is reduced, and the numerical instability caused by directly optimizing the deformation field network is avoided;
For the construction of the loss function, an $\mathcal{L}_{1}$ colour loss supervises the training process during reconstruction, and the optical flow constraint loss and a grid-based total-variation loss $\mathcal{L}_{tv}$ are also applied; the expression is as follows:
(57) $\mathcal{L} = \mathcal{L}_{1} + \lambda_{\mathrm{flow}}\,\mathcal{L}_{\mathrm{flow}} + \lambda_{tv}\,\mathcal{L}_{tv}$
where $\lambda_{\mathrm{flow}}$ and $\lambda_{tv}$ are weight parameters for the optical flow constraint loss and the grid-based total-variation loss, used to balance the influence of the different losses;
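The warm-up schedule and the combined objective of step 4.4 can be expressed as a small helper; the 3000-iteration warm-up follows the description above, while the loss weights are placeholders rather than tuned values.

```python
def total_loss(l1_color, l_flow, l_tv, step, warmup=3000,
               lam_flow=0.05, lam_tv=0.001):
    """Combined objective (eq. (57)-style): during the first `warmup` iterations
    only the static 3D Gaussians are supervised by the colour term; afterwards
    the flow-constraint and grid-based total-variation terms are switched on."""
    if step < warmup:
        return l1_color
    return l1_color + lam_flow * l_flow + lam_tv * l_tv
```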
Step 4.5 loop detection;
The module can detect potential loop candidates by extracting laser radar and visual features and generating feature descriptors;
confirming a loop hypothesis by using geometric verification and consistency check;
and feeding the verified loop constraint back to the map optimization process to globally optimize the map structure.
Through the loop detection step, the robustness and the efficiency of the SLAM system can be obviously improved, and the map quality and the positioning accuracy in long-time operation are ensured.
Similar technical solutions can be derived from the content presented above in connection with the figures and the description; all such solutions that do not depart from the structure of the present application remain within the scope of protection of the claims of the present application.

Claims (10)

1. A reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting, characterized by comprising the following steps:
Step 1, multi-source sensor input and data preprocessing: receiving point cloud data from a lidar, RGB image data from a depth camera and IMU data, and performing data preprocessing;
Step 2, selecting the sensor combination type: based on a reinforcement learning sensor selection module, learning the environment state in real time through a reinforcement learning policy network, dynamically selecting the most reliable sensor combination type and activating sensors on demand;
Step 3, multi-source sensor data fusion processing: based on a data fusion processing module, performing multi-source sensor data fusion processing with the selected sensor combination;
Step 4, map update and optimization: based on a map update and back-end optimization module, performing Gaussian map update and optimization of the environment map by using the map data provided by the front-end data fusion processing module in combination with 4D GS, and performing error correction and map optimization through loop closure detection.
2. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 1, characterized in that said step 1 comprises the following steps:
Step 1.1, data input: the SLAM system temporally synchronizes the point cloud data received from the lidar, the RGB image data from the depth camera and the IMU data;
Step 1.2, incorporation of time information: timestamps are introduced in the data preprocessing stage, and every lidar point cloud and image frame carries a timestamp;
Step 1.3, multi-sensor extrinsic calibration: the Kalibr framework is used to calibrate the extrinsic parameters between the sensors;
Step 1.4, reinforcement learning module initialization: the reinforcement-learning-based sensor selection module is initialized in the data preprocessing stage and, by analysing historical data and environmental characteristics, learns a strategy for selecting the optimal sensor combination under different scene conditions; in the initial state a random exploration strategy is adopted.
3. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 1, characterized in that said step 2 comprises the following steps:
Step 2.1, state space design: defining the environment information that the reinforcement learning system can observe at each decision moment;
Step 2.2, defining the action space formula: including defining the actions with which the system switches sensor modes, so as to optimize data acquisition and system performance; the SLAM system adopts a hybrid discrete-continuous space to achieve fine-grained control; the mode selection part is discrete and includes three sensor-dominated modes: a depth camera-IMU mode, a lidar-depth camera mode and a lidar-IMU-depth camera mode; the weight allocation part is continuous and includes sensor gain coefficients and resource limitation parameters; the sensor gain coefficients are the weights of the lidar, the depth camera and the IMU, each ranging between 0 and 1, with the expression as follows: (9); the resource limitation parameter is the lidar sampling rate, ranging between 5 Hz and 40 Hz, with the expression as follows: (10);
Step 2.3, reward function module: comprising a main reward, auxiliary rewards and penalty terms;
Step 2.4, network architecture design: comprising a state encoder which, as the basis of the whole network, is responsible for converting the multimodal sensor data into a unified feature representation so that the subsequent policy network and value network can make decisions and evaluations based on these features; a policy network, which is responsible for outputting specific action selections according to the encoded state features, including discrete action modes and continuous parameter adjustments; and a value network, which is used to evaluate the quality of the current state-action combination and provides learning signals for the policy network to help it optimize the decision process;
Step 2.5, executing the training strategy: a comprehensive training framework is provided for the reinforcement learning sensor adaptive selection module through an efficient data collection mechanism, an intelligent exploration strategy and strict safe-learning constraints, comprising: the data collection mechanism, which provides the system with experience data from interaction with the environment and is the basis for learning and optimizing the strategy; the exploration strategy, which determines how the system explores effectively in unknown or partially known environments so as to obtain more information and experience; and the safe-learning constraints, which ensure that during learning and decision making the system not only pursues high performance but also satisfies a series of safety and practicality requirements.
4. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 3, characterized in that said step 2.1 comprises:
geometric-dynamic feature design, comprising the lidar point cloud density gradient and the dynamic object coverage; the lidar point cloud density gradient reflects the change in the density of the point cloud in space, helps the system perceive the geometric structure of the environment and potential obstacles, and quantifies the observation confidence of the lidar in a local area, with the expression as follows: (1), where the symbols denote, respectively, the number of lidar hit points within a unit voxel, the voxel volume and the normalized range of the reflection intensity; the dynamic object coverage represents the proportion and distribution of dynamic objects in the scene so that the system can grasp the dynamic characteristics of the environment in time, with the expression as follows: (2), where the symbols denote, respectively, the estimated velocity vector of a dynamic object, the projected area of the dynamic object detection box and the field-of-view area of the sensor;
sensory reliability, comprising the IMU confidence and the radar-vision depth consistency; the IMU confidence reflects the credibility of the inertial measurement unit data and is calculated from its noise level and bias stability, a higher value indicating more reliable IMU data, with the expression as follows: (3), where the symbols denote, respectively, the gyroscope bias estimation error and the accelerometer noise standard deviation; the radar-vision depth consistency measures the agreement between the depth data of the vision sensor and of the radar and is calculated by comparing the depth measurements of the two, a higher consistency indicating better agreement between the two sensors, with the expression as follows: (4), where the symbols denote, respectively, the observed depth distribution histogram, the depth distribution histogram of the depth camera and the intersection-over-union of their effective areas;
system resource status monitoring, comprising the real-time computational load and the remaining energy budget; the real-time computational load reflects the current computing pressure of the system and helps the module decide whether the amount of sensor data processing should be reduced or the computing tasks simplified, with the expression as follows: (5), where the symbols denote, respectively, the GPU memory currently in use, the total GPU memory capacity, the time taken to render a single frame and the expected frame period; the remaining energy budget indicates the energy reserve of the system and enables the module to select a sensor combination or adjust the working mode based on the current remaining energy budget, with the expression as follows: (6), where the symbols denote, respectively, the current remaining battery level, the initial total charge, the elapsed system running time and the charge attenuation factor;
spatio-temporal context, comprising the historical decision memory and the scene category probability; the historical decision memory records the success rate and effect of past sensor selection decisions, helping the module learn from experience, avoid repeated mistakes and improve decision efficiency, with the expression as follows: (7), where the symbols denote, respectively, the past actions and a long short-term memory network; the scene category probability calculates, from the current environmental characteristics, the probabilities of belonging to a low-texture scene, a limited-viewing-angle-constrained scene and a dynamic scene so that the module can adjust the sensor selection strategy according to the scene characteristics, with the expression as follows: (8), where the symbols denote, respectively, the joint image-text embedding vector extracted by the CLIP model and a learnable weight matrix, and the distribution is normalized so that it sums to 1.
5. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 3, characterized in that said step 2.3 comprises:
the main reward comprises a positioning accuracy reward and an energy-efficiency economy reward, used respectively to improve the positioning accuracy of the system and the efficiency of energy utilization, with the expressions as follows: (11), (12), where ATE denotes the absolute trajectory error and the remaining symbols denote the total instantaneous power consumption and the reward compensation term applied when the remaining charge exceeds a threshold;
the auxiliary rewards comprise a rendering quality reward and a policy stability reward, intended to improve the visual quality of scene reconstruction and to ensure the continuity and reliability of the policy, with the expressions as follows: (13), (14), where SSIM denotes the structural similarity index, LPIPS denotes the learned perceptual image patch similarity, and the remaining symbols denote the action-mutation penalty and the smoothness constraint on the policy network parameters;
the penalty terms comprise a fatal error penalty and a resource overload penalty, used to avoid unreasonable sensor selection and excessive resource consumption, thereby guiding the system to learn the optimal sensor selection strategy and achieving overall optimization of system performance, with the expressions as follows: (15), (16);
the composite reward is calculated as follows: (17).
6. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 1, characterized in that said step 3 comprises the following steps:
the reinforcement learning sensor selection module adaptively selects the most suitable sensor mode according to the current environmental conditions and system state, taking into account the influence of each parameter on the sensor mode selection; the weight function is expressed as the weighted sum (21) of the lidar confidence, the dynamic object coverage, the IMU confidence, the radar-vision consistency, the computational load and the remaining energy budget, the coefficients denoting the respective weights of these quantities;
Step 3.1, depth camera-IMU mode: when, according to the current environmental characteristics, the weight function S reaches the corresponding judgment threshold, the SLAM system selects the sensor combination of depth camera and IMU, with the depth camera dominant and the IMU auxiliary;
Step 3.2, lidar-depth camera mode: when, according to the current environmental characteristics, the weight function reaches the corresponding judgment threshold, the SLAM system selects the sensor combination of lidar and depth camera, with the depth camera dominant and the lidar auxiliary;
Step 3.3, lidar-IMU-depth camera multi-sensor balanced mode: when, according to the current environmental characteristics, the weight function reaches the corresponding judgment threshold, the SLAM system selects the sensor mode combining lidar, depth camera and IMU;
Step 3.4, after the reinforcement learning sensor selection module executes an action and interacts with the environment, the historical decision memory is recorded through the reward function module and evaluated through the value network, optimizing the update of the whole policy network.
7. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 6, characterized in that said step 3.1 comprises:
Step 3.1.1, data input and key frame selection: frames containing new scene structure or a significant increase in features are preferentially selected as key frames, improving the completeness and efficiency of the map;
Step 3.1.2, point cloud registration using generalized ICP tracking: the RGB image and depth image of the current frame are selected to generate a point cloud, the covariance matrix of each point is calculated, and the relative pose transformation between the source point cloud of the current frame and the target map point cloud is estimated by the GICP algorithm;
Step 3.1.3, IMU pre-integration: the IMU high-frequency measurements are fused to generate an inter-frame motion pre-integration quantity as the initial guess for GICP, improving tracking efficiency; combined with the GICP algorithm, this enhances the robustness and accuracy of the system;
Step 3.1.4, updating the pre-integration: the GICP optimization result is used to correct the drift of the IMU integration over time caused by noise and bias.
8. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 6, characterized in that said step 3.2 comprises the following steps:
Step 3.2.1, data input: images from the depth camera and point clouds from the lidar; the time-aligned LiDAR point cloud is converted into a depth image using the calibrated extrinsic parameters, with the expression as follows: (38), where the symbols denote, respectively, the lidar point cloud, the rotation matrix and translation vector from the lidar to the camera coordinate system, and the intrinsic matrix of the camera;
Step 3.2.2, using an incremental error minimization function: to ensure accurate correspondence between planes and points, the formula is as follows: (39), where the symbols denote a point in the lidar point cloud, the current pose estimate from the previous moment to the world coordinate system after a number of iterations, the nearest Gaussian centre, its normal vector, the weight of the point, and a regularization term used to enhance the stability and accuracy of the error function by considering the directional error between normal vectors; the regularization term is introduced to enhance the stability and accuracy of the error function and accounts for the error in the normal direction, with the expression as follows: (40), where the symbol denotes the normal of the current Gaussian distribution;
Step 3.2.3, weight function calculation, the calculation steps being as follows:
a. determining the Gaussian centres within a local spherical region: finding all of the nearest Gaussian distribution centres inside the sphere defined by the sphere centre and the radius;
b. calculating the density function of the Gaussian points, the density function being calculated by the following formula: (41), where the reconstructed covariance matrix is built by selecting the minimum variance along the normal direction and a larger variance in the perpendicular direction;
c. simplifying the density function calculation: to accelerate the computation, the calculation of the density function is simplified during tracking: (42);
d. consistency calculation: for each point, calculating the consistency between the normal of the current Gaussian distribution and the local average normal;
e. texture complexity calculation: for the image region corresponding to each radar point, calculating the local texture complexity; projecting the radar point into the pixel coordinate system of the camera, obtaining its position in the image, extracting an image block of side length 16 centred on this position, converting the image block into a grey-scale image and calculating the variance of the pixel intensities: (43), where the symbols denote the pixel intensity and the mean value of the image block; converting the variance into a texture weight between 0 and 1 through a sigmoid function: (44), where the scaling factor is used to adjust the variance sensitivity;
f. final weight function: the final weight function is defined as the product of the normal consistency, the density function and the texture complexity.
9. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 6, characterized in that said step 3.3 comprises the following steps:
Step 3.3.1, data input and hardware synchronization: the lidar provides sparse but high-precision 3D point clouds, the depth camera captures the RGB texture of the scene, and the IMU outputs angular velocity and acceleration at high frequency for motion prediction; the data timestamps of the three are strictly aligned to avoid timing drift;
Step 3.3.2, key frame selection from the depth camera input: representative frames are selected from the continuous data stream as key frames, reducing redundant computation;
Step 3.3.3, IMU data used for state propagation: according to the input IMU data and the state of the previous key frame, the current state is predicted through IMU pre-integration, which is used for state estimation and for forward prediction in motion de-distortion;
Step 3.3.4, de-distortion of the lidar input: using the continuous poses predicted by the IMU, each lidar point is transformed from the local coordinate system at its scanning instant to the global coordinate system, with the expression as follows: (45), where the symbols denote the acquisition timestamp of the point and the pose at that instant obtained by IMU interpolation.
10. The reinforcement learning adaptive multimodal SLAM method based on 4D Gaussian splatting according to claim 1, characterized in that said step 4 comprises the following steps:
Step 4.1, sliding window maintenance;
Step 4.2, 4D Gaussian distribution;
Step 4.3, introducing optical flow to solve the 4D GS overfitting problem;
Step 4.4, map update and optimization;
Step 4.5, loop closure detection: by extracting lidar and visual features and generating feature descriptors, the module is able to detect potential loop closure candidates; the loop closure hypotheses are confirmed using geometric verification and consistency checks; and the verified loop closure constraints are fed back into the map optimization process to globally optimize the map structure.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202510607098.5A | 2025-05-13 | 2025-05-13 | 4D Gaussian splatter-based reinforcement learning self-adaptive multi-mode SLAM method

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202510607098.5A | 2025-05-13 | 2025-05-13 | 4D Gaussian splatter-based reinforcement learning self-adaptive multi-mode SLAM method

Publications (2)

Publication Number | Publication Date
CN120141447A | 2025-06-13
CN120141447B (en) | 2025-08-19

Family

ID=95945516

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202510607098.5A (Active, CN120141447B (en)) | 4D Gaussian splatter-based reinforcement learning self-adaptive multi-mode SLAM method | 2025-05-13 | 2025-05-13

Country Status (1)

Country | Link
CN (1) | CN120141447B (en)


Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20190217476A1 (en)*2018-01-122019-07-18Futurewei Technologies, Inc.Robot navigation and object tracking
CN116263335A (en)*2023-02-072023-06-16浙江大学Indoor navigation method based on vision and radar information fusion and reinforcement learning
CN117990088A (en)*2024-02-022024-05-07上海人工智能创新中心Dense visual SLAM method and system using three-dimensional Gaussian back end representation
CN118071873A (en)*2024-03-122024-05-24浙江师范大学Dense Gaussian map reconstruction method and system in dynamic environment
US20240355047A1 (en)*2024-04-302024-10-24Intel CorporationThree dimensional gaussian splatting initialization based on trained neural radiance field representations
CN118840486A (en)*2024-07-052024-10-25三星电子(中国)研发中心Three-dimensional reconstruction method and device of scene, electronic equipment and storage medium
CN119006700A (en)*2024-07-102024-11-22中科(洛阳)机器人与智能装备研究院Complex scene three-dimensional reconstruction method based on multi-source fusion
CN119180908A (en)*2024-08-132024-12-24武汉大学深圳研究院Gaussian splatter-based laser enhanced visual three-dimensional reconstruction method and system
CN119197556A (en)*2024-09-292024-12-27上海电机学院 Enhanced RTK positioning method for blind spot compensation of laser inertial navigation odometer
CN119355714A (en)*2024-10-222025-01-24西南科技大学 A multi-sensor fusion SLAM method suitable for dynamic rain and fog environment
CN119846644A (en)*2024-11-252025-04-18北京氢源智能科技有限公司Position correction method and device applied to unmanned aerial vehicle laser SLAM system
CN119941985A (en)*2024-12-312025-05-06广州大学 Three-dimensional modeling method and device based on reinforcement learning and multi-sensor data fusion
CN119478277A (en)*2025-01-142025-02-18中国人民解放军火箭军工程大学 A dense visual scene reconstruction method and system
CN119828481A (en)*2025-03-132025-04-15北京知达客信息技术有限公司Cockpit multi-scene switching output control method based on artificial intelligence

Also Published As

Publication number | Publication date
CN120141447B (en) | 2025-08-19

Similar Documents

Publication | Publication Date | Title
CN111798475B (en)Indoor environment 3D semantic map construction method based on point cloud deep learning
US9990736B2 (en)Robust anytime tracking combining 3D shape, color, and motion with annealed dynamic histograms
CN119339008B (en) Dynamic modeling method of scene space 3D model based on multimodal data
CN111325794A (en) A Visual Simultaneous Localization and Map Construction Method Based on Deep Convolutional Autoencoders
CN117152228A (en) Self-supervised image depth estimation method based on channel self-attention mechanism
CN117789082A (en)Target entity track prediction method based on space-ground fusion
Wang et al.Structerf-SLAM: Neural implicit representation SLAM for structural environments
CN116824433A (en)Visual-inertial navigation-radar fusion self-positioning method based on self-supervision neural network
CN119828161A (en)Mobile robot positioning and navigation method based on laser SLAM
CN114202579B (en)Dynamic scene-oriented real-time multi-body SLAM system
CN120141447B (en)4D Gaussian splatter-based reinforcement learning self-adaptive multi-mode SLAM method
CN118397368A (en)Data value mining method based on automatic driving world model in data closed loop
CN112344936A (en)Semantic SLAM-based mobile robot automatic navigation and target recognition algorithm
CN115797400A (en)Multi-unmanned system collaborative long-term target tracking method
Mohanty et al.Self-Supervised Tight Coupling of GNSS with Neural Radiance Fields for UAV Navigation
Abdein et al.Self-supervised uncertainty-guided refinement for robust joint optical flow and depth estimation
CN120236095B (en)Distributed optical fiber sensing mass target data screening method based on image processing
CN119618208B (en)SLAM method based on fusion of end-to-end visual odometer and aerial view multisensor
CN120374963B (en)Camera three-dimensional target detection method and system based on multi-mode distillation
CN119723517B (en) Six-degree-of-freedom prediction method and system for unmanned ships based on visual analysis
CN119820559B (en) A visual positioning method based on self-learning
CN120070492A (en)Three-dimensional scene flow estimation two-way learning network based on point aggregation
Han et al.Multi-Scale Spatiotemporal Transformer Networks for Trajectory Prediction of Autonomous Driving
CN120431176A (en)Occupancy prediction method of super-resolution image reconstruction and space-time self-adaption
CN116430478A (en) Method, system, equipment and medium for extrapolation and prediction of radar chart

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
