CN105989586A - SLAM method based on semantic bundle adjustment method - Google Patents

SLAM method based on semantic bundle adjustment method

Info

Publication number
CN105989586A
Authority
CN
China
Prior art keywords
edge
semantic
slam
graph
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510050181.3A
Other languages
Chinese (zh)
Inventor
廖鸿宇
孙放
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Thunderous Yun He Intellectual Technology Co Ltd
Original Assignee
Beijing Thunderous Yun He Intellectual Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Thunderous Yun He Intellectual Technology Co Ltd
Priority to CN201510050181.3A
Publication of CN105989586A
Legal status: Pending

Abstract

The invention provides an SLAM method based on a semantic bundle adjustment method, and belongs to the field of mobile-robot simultaneous localization and mapping (SLAM). The method jointly optimizes 6DOF object poses and the camera pose through a new semantic global optimization, and can work with either 2D or 3D sensors. Because semantic information is added, the object-detection pipeline can be seamlessly integrated into the bundle-adjustment-style optimization of any BA-based SLAM system without external equipment. The method is simple, easy to implement, and highly practical; SLAM constraints can be used for robust object detection, and the method can adapt to more complex environments.

Description

SLAM method based on semantic bundle adjustment method
Technical Field
The invention relates to an SLAM method based on semantic bundle adjustment, belonging to the field of simultaneous localization and mapping (SLAM) for mobile robots.
Background
The visual SLAM (simultaneous localization and mapping) problem involves gradually reconstructing a map while simultaneously localizing the sensing device based only on visual cues. Over the last decade there has been remarkable progress in this field, with applications such as AR (augmented reality) and robot navigation and mapping.
The conventional SLAM problem is solved with filtering techniques such as the Kalman filter, in which visual features are tracked frame by frame together with their estimated 3D positions and the uncertain camera position. This approach can only create sparse maps, since only a small fraction of the image pixels are tracked. Alternatively, the visual SLAM problem can be solved by BA (bundle adjustment) optimization over a subset of selected frames. In recent years, various BA-based methods have been developed. One successful BA model is PTAM (parallel tracking and mapping). This approach separates the SLAM problem into 2 parallel, related tasks: one task tracks the camera using the currently estimated landmark positions; the other manages the global optimization of the selected key frames. Since complexity grows dramatically with the number of features extracted from the environment, PTAM can only be used in small spaces. To overcome this drawback, later schemes optimize not over all past tracked key frames but over a small fraction of them, so that the algorithm reaches constant time complexity.
Although the tracking and mapping tasks have matured to some extent, none of the previous methods can seamlessly resolve and extract semantic information during visual SLAM. Among the algorithms built on a SLAM framework, most perform object detection from a single viewpoint and do not enforce multi-view consistency. Some schemes precisely combine a detection scheme with a SLAM framework but do not attempt to estimate the actual position of the target object; others use geometric information to detect targets continuously but do not effectively exploit the originally detected objects to build a known map. There are also algorithms that use standard feature-based pipelines to detect objects and represent their positions by estimated relative poses, using laser and range data to detect objects on a map, but these still do not integrate object-position estimation with the camera. Some algorithms use a combined detection-and-reconstruction approach, but are either limited to cars and pedestrians under assumed environments and camera positions, or propose joint pixel labeling and dense stereo methods that require calibrated small-baseline cameras. A semantic structure-from-motion approach estimates camera pose and identifies objects from an image sequence, but because it extends an SFM pipeline and processes all frames at once, it must create hypotheses and measurements with an external object detector.
Disclosure of Invention
The invention aims to provide an SLAM method based on a semantic bundle adjustment method that addresses the defects of the above technologies; the measurement evaluates the feature consistency between an input frame and a feature library under low-dimensional abstraction. More importantly, because semantic information is added, the object-detection pipeline can be seamlessly integrated into the bundle-adjustment-style optimization of any BA-based SLAM system without external equipment.
The invention is realized as follows: an SLAM method based on a semantic bundle adjustment method comprises the following steps.
Step 1: determine a series of features for each detection target and establish a model database for each detection target.
Step 2: as each new frame becomes available, extract its descriptive features and match them against the model database, then create a determination graph for each given detection target.
Step 3: verify the determination graph from Step 2, removing erroneous edges and keeping correct edges. The global weighted mean residual $\bar{Q}$ from the previous global optimization serves as the verification threshold in the next target-detection pass:

$$\bar{Q} = \frac{\sum_{ij} w_{ij}\, e_{ij}^{T} e_{ij}}{\sum_{ij} w_{ij}}$$

where the sums run over all edges, $e_{ij}$ is the residual of the edge between vertices $i$ and $j$, and $w_{ij}$ is its weight.
Step 3.1: for the 2D feature-matching scheme, compare each frame-to-landmark edge against the threshold and remove the edge if the following holds:

$$\left\| p_i^{o,n} - V_i\left[ x_{h(o,n)} \right] \right\|^2 \ge \alpha \bar{Q}$$

where $h(o,n)$ is the index of the landmark vertex associated with the $n$th feature point on the $o$th object, and $\alpha$ is a given parameter ranging from 4 to 9. If removing the frame-to-landmark edge leaves the landmark vertex attached only to the object, the object-to-landmark edge must also be deleted.
Step 3.2: for 3D feature matching, compare each frame-to-object edge against the threshold and remove the edge if the following holds:

$$\left\| q_o^{n} - x_o^{-1} x_i\left[ p_i^{o,n} \right] \right\|^2 \ge \alpha \bar{Q}$$
Step 4: after the correct semantic edges are determined, evaluate whether the determination graph is added to the global graph according to the edge thresholds.
Step 5: check whether a new frame appears. If no new frame appears, finish with global graph optimization; if a new frame appears, return to Step 2 and repeat Steps 2 through 5.
The edge-threshold evaluation method is as follows:
This process can place the determination graph in 3 different states, defined by comparing the final number of semantic edges $N_{se}$ with 2 thresholds $\eta_f$ and $\eta_t$ ($\eta_f < \eta_t$):

If $N_{se} < \eta_f$, the target is treated as a false detection; this determination graph is deleted and the detection target is removed from the global map.

If $\eta_f \le N_{se} < \eta_t$, the object detection is ambiguous; the determination graph is saved to await more visual cues, but the detected object is removed from the global map.

If $N_{se} \ge \eta_t$, the target is detected and added to the global map.
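As an illustration, this three-state decision can be sketched in Python. The threshold values below are invented for the example; the patent leaves $\eta_f$ and $\eta_t$ as tunable parameters.

```python
def classify_determination_graph(n_se, eta_f, eta_t):
    """Three-state decision driven by the final semantic edge count N_se
    and the two thresholds eta_f < eta_t."""
    if n_se < eta_f:
        return "false-detection"   # delete the graph, drop target from global map
    if n_se < eta_t:
        return "ambiguous"         # keep the graph for more cues, withhold from map
    return "detected"              # add the target to the global map

# Illustrative thresholds: eta_f = 3, eta_t = 8.
states = [classify_determination_graph(n, eta_f=3, eta_t=8) for n in (1, 5, 9)]
# states == ["false-detection", "ambiguous", "detected"]
```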
The beneficial effects of the invention are:
1. The method combines 6DOF (6 degrees of freedom, i.e. translation along three axes and rotation about three axes) object poses and camera poses through a new semantic global optimization; owing to this, it can work with 2D or 3D sensors.
2. Because semantic information is added, the object-detection pipeline can be seamlessly integrated into the bundle-adjustment-style optimization of any BA-based SLAM system without external equipment.
3. The method is simple, easy to implement, and highly practical; SLAM constraints can be used for robust object detection, and the method can adapt to more complex environments.
Drawings
FIG. 1 is a flow chart of the SLAM method based on the semantic bundle adjustment method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention is realized as follows: an SLAM method based on a semantic bundle adjustment method comprises the following steps.
Step 1: determine a series of features for each detection target and establish a model database for each detection target. If a full 3D model is available, 3D keypoint detectors and descriptors can be used; otherwise, the model requires a stack of standardized pictures and 2D keypoint detectors and descriptors that provide the required features. The feature descriptors are saved for future matching. For the feature positions, 3D coordinates are saved in the former case, and 2D picture coordinates together with their associated view poses are saved in the latter.
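A minimal sketch of such a database, assuming NumPy arrays and hypothetical field names (the patent does not prescribe a storage format):

```python
import numpy as np

def build_model_database(targets):
    """Store each target's feature descriptors, plus 3D keypoint coordinates
    when a full 3D model exists, or 2D picture coordinates together with the
    poses of the standardized views they were extracted from."""
    db = {}
    for name, data in targets.items():
        entry = {"descriptors": np.asarray(data["descriptors"], dtype=float)}
        if "points_3d" in data:                       # full 3D model available
            entry["points_3d"] = np.asarray(data["points_3d"], dtype=float)
        else:                                         # stack of standardized views
            entry["points_2d"] = np.asarray(data["points_2d"], dtype=float)
            entry["view_poses"] = np.asarray(data["view_poses"], dtype=float)
        db[name] = entry
    return db

# Toy targets with random data: one 3D model, one set of 2D views.
rng = np.random.default_rng(1)
db = build_model_database({
    "box":  {"descriptors": rng.normal(size=(8, 16)),
             "points_3d": rng.normal(size=(8, 3))},
    "logo": {"descriptors": rng.normal(size=(5, 16)),
             "points_2d": rng.normal(size=(5, 2)),
             "view_poses": np.tile(np.eye(4), (5, 1, 1))},
})
```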
Step 2: as each new frame becomes available, extract its descriptive features and match them against the model database, then create a determination graph for each given detection target. For both the model and the frame, 3D keypoints are extracted at 3 different scales by an intrinsic shape signature detector, and then described by spin images. For each scale, an index containing the descriptors of all models at that scale is created, and a k-nearest-neighbor search based on Euclidean distance is performed at that scale; finally, a weighted average difference is calculated.
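The per-scale matching step can be sketched as a brute-force Euclidean k-nearest-neighbor search; the patent does not specify the index structure, and the descriptors and weights below are invented for illustration.

```python
import numpy as np

def knn_match(frame_desc, model_desc, k=2):
    """Euclidean k-nearest-neighbour search of frame descriptors against
    the model descriptors of one scale; returns indices and distances."""
    d = np.linalg.norm(frame_desc[:, None, :] - model_desc[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    return idx, np.take_along_axis(d, idx, axis=1)

def weighted_average_difference(dists, weights):
    """Weighted average of nearest-neighbour distances."""
    return float(np.sum(weights * dists) / np.sum(weights))

rng = np.random.default_rng(2)
model = rng.normal(size=(30, 16))                    # one scale of the index
frame = model[:5] + 0.01 * rng.normal(size=(5, 16))  # 5 noisy re-detections
idx, dists = knn_match(frame, model)
score = weighted_average_difference(dists[:, 0], np.ones(5))
```

Each noisy re-detection lands on its source descriptor as nearest neighbour, so `idx[:, 0]` recovers the original indices and `score` stays small.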
Step 2.1: if the 2D features match, a new landmark with unknown 3D position and a series of edges are created in the cost function to include its projection errors. For example, considering a landmark corresponding to position $x_4$ and the $n$th target feature, the related constraint is:

$$\left\| q_3^{n} - V_3[x_4] \right\|^2 + s_1^{3,n} \left\| p_1^{3,n} - V_1[x_4] \right\|^2 + s_2^{3,n} \left\| p_2^{3,n} - V_2[x_4] \right\|^2$$

where $q_m^{n}$ represents the $n$th 2D feature point learned from the $m$th object, $p_i^{m,n}$ denotes the 2D feature point matched in the $i$th frame with likelihood $s_i^{m,n}$, and $V_i[r]$ rotates and translates $r \in \mathbb{R}^3$ into the image plane of the $i$th vertex using the current pose estimate.
Step 2.2: when the $V_i$ described in Step 2.1 relates to an object pose, the picture plane is one of the standard views of the object acquired during training, so the reprojection is constrained by a rigid transformation between the known object reference frame and the view reference frame. In this case, the constraints of the formula in Step 2.1 can be expressed as follows:

$$\left\| q_3^{n} - V_3[x_4] \right\|^2 = \left( q_3^{n} - V_3[x_4] \right)^{T} I_{2\times 2} \left( q_3^{n} - V_3[x_4] \right)$$

and

$$s_1^{3,n} \left\| p_1^{3,n} - V_1[x_4] \right\|^2 = \left( p_1^{3,n} - V_1[x_4] \right)^{T} \left( s_1^{3,n} I_{2\times 2} \right) \left( p_1^{3,n} - V_1[x_4] \right)$$
step 2.3, when 3D features are available, connect camera frame and more objects directly, therefore, if m is availableijIs to represent that i feature and j feature match, and known as Pr (m)ik)=sikAnd Pr (m)jk)=sjkAssume mikAnd mjkAre independent and satisfy the following formula:
then Pr (m)ij)=sik*sjkNow, nowAndrepresenting 3D features.
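The independence assumption in Step 2.3 reduces to a simple product of likelihoods; a one-line sketch (the likelihood values are invented):

```python
def chained_match_probability(s_ik, s_jk):
    """Pr(m_ij) = s_ik * s_jk: if feature i matches model feature k with
    likelihood s_ik and feature j matches the same k with likelihood s_jk,
    and the two matches are independent, then features i and j are matched
    to each other with the product of the two likelihoods."""
    return s_ik * s_jk

p = chained_match_probability(0.9, 0.8)   # approximately 0.72
```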
Step 3: verify the determination graph from Step 2, removing erroneous edges and keeping correct edges. The global weighted mean residual $\bar{Q}$ from the previous global optimization serves as the verification threshold in the next target-detection pass:

$$\bar{Q} = \frac{\sum_{ij} w_{ij}\, e_{ij}^{T} e_{ij}}{\sum_{ij} w_{ij}}$$

where the sums run over all edges, $e_{ij}$ is the residual of the edge between vertices $i$ and $j$, and $w_{ij}$ is its weight.
Step 3.1: for the 2D feature-matching scheme, a frame-to-landmark edge is removed when it satisfies the following:

$$\left\| p_i^{o,n} - V_i\left[ x_{h(o,n)} \right] \right\|^2 \ge \alpha \bar{Q}$$

where $h(o,n)$ is the index of the landmark vertex associated with the $n$th feature point on the $o$th object, and $\alpha = 7$. If removing the frame-to-landmark edge leaves the landmark vertex attached only to the object, the object-to-landmark edge must also be deleted.
Step 3.2: for the 3D feature-matching scheme, each frame-to-object edge is compared against the threshold and erased when the following holds:

$$\left\| q_o^{n} - x_o^{-1} x_i\left[ p_i^{o,n} \right] \right\|^2 \ge \alpha \bar{Q}$$
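Steps 3, 3.1 and 3.2 share one pruning rule: drop any edge whose squared residual reaches $\alpha\bar{Q}$. A minimal NumPy sketch, with toy residual vectors and an illustrative $\alpha$:

```python
import numpy as np

def weighted_mean_residual(residuals, weights):
    """Q-bar = sum_ij w_ij * e_ij^T e_ij / sum_ij w_ij over all edges of
    the previous global optimization."""
    residuals = np.asarray(residuals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * np.sum(residuals**2, axis=1)) / np.sum(weights))

def prune_edges(residuals, q_bar, alpha=7.0):
    """Keep edge i only if ||e_i||^2 < alpha * Q-bar; the same rule covers
    the 2D reprojection residuals of Step 3.1 and the 3D residuals of
    Step 3.2."""
    return [i for i, e in enumerate(np.asarray(residuals, dtype=float))
            if float(np.dot(e, e)) < alpha * q_bar]

edges = [[0.1, 0.1], [0.2, 0.0], [5.0, 5.0]]     # toy edge residual vectors
q_bar = weighted_mean_residual(edges, [1.0, 1.0, 1.0])
kept = prune_edges(edges, q_bar, alpha=1.0)       # the large-residual edge goes
```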
Step 4: after the correct semantic edges are determined, evaluate whether the determination graph is added to the global graph according to the edge thresholds. First, the determination graph is placed in one of 3 different states, defined by comparing the final number of semantic edges $N_{se}$ with 2 thresholds $\eta_f$ and $\eta_t$ ($\eta_f < \eta_t$):

If $N_{se} < \eta_f$, the target is treated as a false detection; this determination graph is deleted and the detection target is removed from the global map.

If $\eta_f \le N_{se} < \eta_t$, the target detection is ambiguous; the determination graph is saved to await more visual cues, but the detection target is removed from the global map.

If $N_{se} \ge \eta_t$, the target is detected and added to the global map.
Of these, the 2 thresholds $\eta_f$ and $\eta_t$ are not critical, because at every frame the validation of new hypotheses is constrained by the matching of previously extracted features, and the final evaluation keeps only the best candidates. A higher $\eta_f$ may merely skip matching highly occluded objects seen in only a few views, while the difference $\eta_t - \eta_f$ governs the robustness of the verification procedure: a higher value means that consistency across different frames is more likely to be required before a detection is accepted, while the few errors that do pass into the global map can be deleted in the following frames.
The global map in Step 4 is a semantic global map, comprising: firstly, the pose vertices of all cameras, with their frame-to-frame constraints, obtained from the SLAM engine; secondly, the pose vertices of all objects successfully verified by the above process; and thirdly, all frame-to-landmark and object-to-landmark constraints (from 2D feature matching), or frame-to-object and virtual frame-to-frame constraints (from 3D feature matching), the virtual constraints being derived from the verification graphs of deleted objects.
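The three constituents of the semantic global map can be sketched as a container type; the field names and string placeholders are ours, not the patent's:

```python
from dataclasses import dataclass, field

@dataclass
class SemanticGlobalMap:
    """Holds (1) camera pose vertices with frame-to-frame constraints from
    the SLAM engine, (2) verified object pose vertices, and (3) semantic
    constraints: frame-to-landmark / object-to-landmark for 2D matching,
    or frame-to-object / virtual frame-to-frame for 3D matching."""
    camera_poses: list = field(default_factory=list)
    object_poses: dict = field(default_factory=dict)
    constraints: list = field(default_factory=list)   # (kind, from_id, to_id)

g = SemanticGlobalMap()
g.camera_poses.append("frame-0 pose")                 # placeholder pose
g.object_poses["object-0"] = "verified pose"          # placeholder pose
g.constraints.append(("frame-object", 0, "object-0"))
```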
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (2)

CN201510050181.3A | 2015-03-04 | 2015-03-04 | SLAM method based on semantic bundle adjustment method | Pending | CN105989586A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201510050181.3A | 2015-03-04 | 2015-03-04 | SLAM method based on semantic bundle adjustment method


Publications (1)

Publication Number | Publication Date
CN105989586A (en) | 2016-10-05

Family

ID=57036891

Family Applications (1)

Application Number | Priority Date | Filing Date | Status | Publication | Title
CN201510050181.3A | 2015-03-04 | 2015-03-04 | Pending | CN105989586A (en) | SLAM method based on semantic bundle adjustment method

Country Status (1)

Country | Link
CN | CN105989586A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103119611A (en)* | 2010-06-25 | 2013-05-22 | 天宝导航有限公司 | Method and apparatus for image-based positioning
CN103824080A (en)* | 2014-02-21 | 2014-05-28 | 北京化工大学 | Robot SLAM object state detection method in dynamic sparse environment
US20140320593A1 (en)* | 2013-04-30 | 2014-10-30 | Qualcomm Incorporated | Monocular visual SLAM with general and panorama camera movements


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fioraio, N. et al.: "Joint Detection, Tracking and Mapping by Semantic Bundle Adjustment", 2013 IEEE Computer Society Conference on Computer Vision and Pattern Recognition *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107833250A (en)* | 2017-10-24 | 2018-03-23 | 北京易达图灵科技有限公司 | Semantic space map constructing method and device
CN107833250B (en)* | 2017-10-24 | 2020-05-05 | 北京易达图灵科技有限公司 | Semantic space map construction method and device
CN108230337A (en)* | 2017-12-31 | 2018-06-29 | 厦门大学 | A method for implementing a semantic SLAM system based on a mobile terminal
CN108230337B (en)* | 2017-12-31 | 2020-07-03 | 厦门大学 | Semantic SLAM system implementation method based on mobile terminal
CN108572939A (en)* | 2018-04-27 | 2018-09-25 | 百度在线网络技术(北京)有限公司 | Optimization method, device, equipment and computer-readable medium for VI-SLAM
CN108572939B (en)* | 2018-04-27 | 2020-05-08 | 百度在线网络技术(北京)有限公司 | VI-SLAM optimization method, device, equipment and computer readable medium
WO2019210978A1 (en) | 2018-05-04 | 2019-11-07 | Huawei Technologies Co., Ltd. | Image processing apparatus and method for an advanced driver assistance system
CN109186606A (en)* | 2018-09-07 | 2019-01-11 | 南京理工大学 | A robot composition and navigation method based on SLAM and image information
CN109186606B (en)* | 2018-09-07 | 2022-03-08 | 南京理工大学 | Robot composition and navigation method based on SLAM and image information
GB2581808A (en)* | 2019-02-26 | 2020-09-02 | Imperial College Sci Tech & Medicine | Scene representation using image processing
GB2581808B (en)* | 2019-02-26 | 2022-08-10 | Imperial College Innovations Ltd | Scene representation using image processing
US12205297B2 | 2019-02-26 | 2025-01-21 | Imperial College Innovations Limited | Scene representation using image processing
CN110097584A (en)* | 2019-03-18 | 2019-08-06 | 国网浙江省电力有限公司信息通信分公司 | Image registration method combining object detection and semantic segmentation
CN110097584B (en)* | 2019-03-18 | 2021-11-09 | 国网浙江省电力有限公司信息通信分公司 | Image registration method combining object detection and semantic segmentation
CN113743413A (en)* | 2021-07-30 | 2021-12-03 | 的卢技术有限公司 | Visual SLAM method and system combining image semantic information
CN113743413B (en)* | 2021-07-30 | 2023-12-01 | 的卢技术有限公司 | Visual SLAM method and system combining image semantic information
CN113936085A (en)* | 2021-12-17 | 2022-01-14 | 荣耀终端有限公司 | Three-dimensional reconstruction method and device


Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
WD01 | Invention patent application deemed withdrawn after publication

Application publication date: 2016-10-05

