Disclosure of Invention
The invention aims to provide a stable motion tracking method and a stable motion tracking device fusing a smartphone monocular camera and an IMU, which achieve more stable motion tracking and realize real-time online scale estimation.
In order to solve the above technical problems, the invention provides the following technical solution:
a stable motion tracking method fusing a smartphone monocular camera and an IMU comprises the following steps:
processing the acquired image by using an ORB algorithm, and then performing 3D reconstruction to obtain initial map points and complete map initialization;
performing visual tracking by using an ORB algorithm in a real-time matching and parallel local key frame mapping mode to obtain a visual pose;
acquiring acceleration and angular velocity values generated by the IMU in a three-dimensional space, and performing integral operation on the acceleration and angular velocity values to obtain an IMU pose prediction result;
and performing Kalman fusion of the visual pose and the IMU pose prediction, and performing motion tracking according to the pose information obtained after fusion.
Further, processing the acquired image by using the ORB algorithm and then performing 3D reconstruction to obtain initial map points and complete map initialization includes:
extracting feature points of the acquired first frame image by using an ORB algorithm, calculating a descriptor, recording the first frame as a key frame, and marking the absolute pose of the camera;
after the camera translates a certain distance, extracting feature points and calculating descriptors from the acquired image by using the ORB algorithm, matching them with the feature points of the first frame image, recording the second frame as a key frame, and calculating the relative pose of the camera at the second frame with respect to the first frame;
and performing 3D reconstruction on the successfully matched feature point set to obtain initial map points.
Further, calculating the relative pose of the camera at the second frame with respect to the first frame includes:
calculating a fundamental matrix between the two frames of images according to the corresponding matched feature point sets on the first frame image and the second frame image;
calculating an essential matrix according to the fundamental matrix and the intrinsic parameters of the camera;
and performing singular value decomposition on the essential matrix to obtain the relative pose of the camera at the second frame with respect to the first frame.
Further, performing visual tracking by using the ORB algorithm in a real-time matching and parallel local key frame mapping mode to obtain the visual pose includes:
extracting image feature points and calculating descriptors from the current frame in a rasterized (grid-based) manner by using the ORB algorithm;
estimating the camera pose of the current frame by using a constant-velocity motion model, projecting all map points of the previous frame image onto the current image frame, matching feature points, and assigning the successfully matched map points of the previous frame to the corresponding feature points of the current frame;
updating the pose and map points of the current frame by using the LM algorithm and a Huber estimator;
and projecting all map points of the local key frames onto the current image frame according to the updated pose, matching feature points, assigning all successfully matched map points to the corresponding feature points of the current frame, and updating the pose and map points of the current frame again by using the LM algorithm and the Huber estimator.
Further, performing visual tracking by using the ORB algorithm in a real-time matching and parallel local key frame mapping mode to obtain the visual pose further includes:
judging whether a key frame needs to be added according to a time interval condition and/or the number of map points of the current frame: if more than a certain time has elapsed since the last key frame was added, or the number of map points of the current frame is below a threshold, adding a new key frame;
judging whether the current frame is a new key frame, and if so, adding new map points: matching all feature points of the new key frame that have no associated map points against all feature points in the local key frames, and performing 3D reconstruction after successful matching to obtain new map points;
and performing local bundle adjustment optimization to correct accumulated errors and obtain optimized poses and map points.
A stable motion tracking device fusing a smartphone monocular camera and an IMU, comprising:
the map initialization module is used for processing the acquired image by utilizing an ORB algorithm and then performing 3D reconstruction to obtain initial map points and finish map initialization;
the visual tracking module is used for carrying out visual tracking in a mode of matching in real time and parallel local key frame mapping by using an ORB algorithm to obtain a visual pose;
an IMU pose calculation module: used for acquiring the acceleration and angular velocity values generated by the IMU in three-dimensional space, and performing integral operation on them to obtain an IMU pose prediction result;
a fusion module: used for performing Kalman fusion of the visual pose and the IMU pose prediction, and performing motion tracking according to the pose information obtained after fusion.
Further, the map initialization module is further configured to:
extracting feature points of the acquired first frame image by using an ORB algorithm, calculating a descriptor, recording the first frame as a key frame, and marking the absolute pose of the camera;
after the camera translates a certain distance, extracting feature points and calculating descriptors from the acquired image by using the ORB algorithm, matching them with the feature points of the first frame image, recording the second frame as a key frame, and calculating the relative pose of the camera at the second frame with respect to the first frame;
and performing 3D reconstruction on the successfully matched feature point set to obtain initial map points.
Further, calculating the relative pose of the camera at the second frame with respect to the first frame includes:
calculating a fundamental matrix between the two frames of images according to the corresponding matched feature point sets on the first frame image and the second frame image;
calculating an essential matrix according to the fundamental matrix and the intrinsic parameters of the camera;
and performing singular value decomposition on the essential matrix to obtain the relative pose of the camera at the second frame with respect to the first frame.
Further, the visual tracking module is further configured to:
extracting image feature points and calculating descriptors from the current frame in a rasterized (grid-based) manner by using the ORB algorithm;
estimating the camera pose of the current frame by using a constant-velocity motion model, projecting all map points of the previous frame image onto the current image frame, matching feature points, and assigning the successfully matched map points of the previous frame to the corresponding feature points of the current frame;
updating the pose and map points of the current frame by using the LM algorithm and a Huber estimator;
and projecting all map points of the local key frames onto the current image frame according to the updated pose, matching feature points, assigning all successfully matched map points to the corresponding feature points of the current frame, and updating the pose and map points of the current frame again by using the LM algorithm and the Huber estimator.
Further, the visual tracking module is further configured to:
judging whether a key frame needs to be added according to a time interval condition and/or the number of map points of the current frame: if more than a certain time has elapsed since the last key frame was added, or the number of map points of the current frame is below a threshold, adding a new key frame;
judging whether the current frame is a new key frame, and if so, adding new map points: matching all feature points of the new key frame that have no associated map points against all feature points in the local key frames, and performing 3D reconstruction after successful matching to obtain new map points;
and performing local bundle adjustment optimization to correct accumulated errors and obtain optimized poses and map points.
The invention has the following beneficial effects:
In the invention, the map is initialized first, and after successful initialization, images are acquired for continuous tracking and pose estimation; meanwhile, IMU data is acquired and integrated to predict the pose; the two are then fused under an Extended Kalman Filter (EKF) framework to obtain a stable pose estimate. Aiming at the motion tracking problem of current mobile VR, the invention uses visual-inertial odometry (VIO) based on the camera and IMU built into the mobile terminal, combining visual measurements and inertial sensor measurements under the EKF framework to accurately estimate the pose and the absolute scale, thereby achieving fast and stable motion tracking for mobile VR. Compared with the prior art, the invention achieves more stable motion tracking and realizes real-time online scale estimation.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
In one aspect, the present invention provides a stable motion tracking method fusing a smartphone monocular camera and an IMU, as shown in fig. 1, including:
step S101: processing the acquired image by using an ORB algorithm, and then performing 3D reconstruction to obtain an initial map point and complete map initialization;
in this step, the purpose of map initialization is to construct an initial three-dimensional point cloud. Since depth information cannot be obtained from only a single frame, it is necessary to select two or more frames of images from an image sequence, estimate a camera pose, and reconstruct an initial three-dimensional point cloud. In this step, two key frames are used, one is an initial key frame (initial frame) and the other is a key frame (end frame) which moves for a certain angle, matching of key points is performed between the initial frame and the end frame, then 3D reconstruction is performed on a feature point set which is successfully matched, and finally map initialization is completed.
Step S102: performing visual tracking by using an ORB algorithm in a real-time matching and parallel local key frame mapping mode to obtain a visual pose;
in this step, after the map is initialized successfully, the movement tracking is performed based on the vision. And (4) considering the weak computing power of the mobile terminal, performing visual tracking by using real-time matching and pose estimation of an ORB algorithm and a parallel local key frame maintenance and mapping mode to obtain a visual pose. The ORB algorithm is used for real-time matching and pose estimation as a tracking thread, and the maintenance and mapping of local key frames are local joint frame threads.
Step S103: acquiring acceleration and angular velocity values generated by the IMU in a three-dimensional space, and performing integral operation on the acceleration and angular velocity values to obtain an IMU pose prediction result;
in this step, the involved IMU (Inertial measurement unit, IMU for short) is a device for measuring the three-axis attitude angular velocity (or angular velocity) and acceleration of an object. Generally, an IMU includes three single-axis accelerometers and three single-axis gyroscopes, the accelerometers are used for detecting acceleration signals of an object in three independent axes of a carrier coordinate system, and the gyroscopes are used for detecting angular velocity signals of a carrier relative to a navigation coordinate system. IMU data is acquired between the front and the back adjacent frames for pose prediction, and the vision pose estimation of the next frame is used as a measurement value for updating.
Step S104: performing Kalman fusion of the visual pose and the IMU pose prediction, and performing motion tracking according to the pose information obtained after fusion;
in the step, in order to acquire a stable tracking pose and fully utilize information obtained by sensors of the vision and IMU, the invention fuses the vision pose obtained by a vision image and a pose prediction result obtained by IMU integration by using a Kalman fusion method so as to realize information complementation and target state estimation of two heterogeneous sensors, thereby acquiring a more accurate and reliable pose after fusion. And then, carrying out motion tracking according to the fused pose information.
In the invention, the map is initialized first, and after successful initialization, images are acquired for continuous tracking and pose estimation; meanwhile, IMU data is acquired and integrated to predict the pose; the two are then fused under an Extended Kalman Filter (EKF) framework to obtain a stable pose estimate. Aiming at the motion tracking problem of current mobile VR, the invention uses visual-inertial odometry (VIO) based on the camera and IMU built into the mobile terminal, combining visual measurements and inertial sensor measurements under the EKF framework to accurately estimate the pose and the absolute scale, thereby achieving fast and stable motion tracking for mobile VR. Compared with the prior art, the invention achieves more stable motion tracking and realizes real-time online scale estimation.
As an improvement of the present invention, processing the acquired image by using the ORB algorithm and then performing 3D reconstruction to obtain initial map points and complete map initialization includes:
extracting feature points of the acquired first frame image by using an ORB algorithm, calculating a descriptor, recording the first frame as a key frame, and marking the absolute pose of the camera;
after the camera translates a certain distance, extracting feature points and calculating descriptors from the acquired image by using the ORB algorithm, matching them with the feature points of the first frame image, recording the second frame as a key frame, and calculating the relative pose of the camera at the second frame with respect to the first frame;
and performing 3D reconstruction on the successfully matched feature point set to obtain initial map points.
In view of this improvement, the present invention provides a complete and specific embodiment as follows:
1. Acquire the first frame image, extract feature points with the locally invariant ORB (Oriented FAST and Rotated BRIEF) algorithm, and calculate descriptors. The first frame is a key frame, and the absolute pose of the camera is marked as $[R_{(0,k)} \mid t_{(0,k)}]$, where the subscript $(0,k)$ denotes the absolute pose of the $k$-th frame; thus $[R_{(0,0)} \mid t_{(0,0)}] = [I \mid 0]$;
2. After the camera has translated a certain distance, acquire an image again, extract feature points with the ORB algorithm, and calculate descriptors. After successful matching with the feature points of the first frame image, mark this frame as a key frame, and calculate the relative pose of the camera at the second frame with respect to the first frame as $[R_{(0,1)} \mid t_{(0,1)}] = [R \mid t]$;
3. Perform 3D reconstruction on the successfully matched feature point set to obtain the initial map points.
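A non-limiting sketch of steps 1-2 using OpenCV's ORB implementation is given below; `img1` and `img2` are assumed, already-acquired key-frame images, and the parameter values are illustrative:

```python
import cv2
import numpy as np

# img1, img2: the two key-frame images (grayscale), assumed already acquired
orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # Hamming for ORB

# frame 1: extract ORB feature points and descriptors; the first frame is
# recorded as a key frame with absolute pose [R_(0,0) | t_(0,0)] = [I | 0]
kp1, des1 = orb.detectAndCompute(img1, None)

# frame 2, acquired after the camera has translated a certain distance
kp2, des2 = orb.detectAndCompute(img2, None)

# match second-frame descriptors against the first frame
matches = matcher.match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
```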
In this embodiment, the ORB algorithm is adopted to extract features and directly perform matching and pose estimation. ORB is an algorithmic improvement combining FAST corner detection with the BRIEF feature descriptor, balancing efficiency and precision in the monocular visual tracking process.
As a further improvement of the present invention, calculating the relative pose of the camera at the second frame with respect to the first frame includes:
calculating a fundamental matrix between the two frames of images according to the corresponding matched feature point sets on the first frame image and the second frame image;
calculating an essential matrix according to the fundamental matrix and the intrinsic parameters of the camera;
and performing singular value decomposition on the essential matrix to obtain the relative pose of the camera at the second frame with respect to the first frame.
For this further improvement, the invention provides the following complete specific embodiment:
① After translating a certain distance, extract feature points and calculate descriptors for the second frame image by using the ORB algorithm, and match them with the feature points of the first frame image; successful matching yields the corresponding matched feature point sets $(X_L, X_R)$ on the two key frames;
② Calculate the fundamental matrix $F$ from the epipolar constraint $X_L^T F X_R = 0$;
③ Obtain the essential matrix $E$ from its relationship with the fundamental matrix, $E = K_L^T F K_R$, where $(K_L, K_R)$ are the intrinsic parameters of the respective cameras, which can be calibrated in advance, and $K_L = K_R$ for a single camera; the essential matrix is related only to the extrinsic parameters of the camera and is independent of the intrinsic parameters;
④ According to $E = [t]_\times R$, where $[t]_\times$ is the skew-symmetric matrix of the translation $t = (t_x, t_y, t_z)^T$ and $R$ is the rotation matrix, $R$ and $t$ can be calculated by applying singular value decomposition (SVD) to the matrix $E$, and the relative pose of the camera at the second frame with respect to the first frame is $[R_{(0,1)} \mid t_{(0,1)}] = [R \mid t]$.
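Continuing the sketch above (with `pts1`, `pts2`, and a pre-calibrated intrinsic matrix `K` assumed), the following non-limiting OpenCV fragment computes $F$ and $E$ and recovers $[R \mid t]$; `cv2.recoverPose` performs the SVD decomposition and cheirality check described in ④, and triangulation then yields the initial map points:

```python
import cv2
import numpy as np

# pts1, pts2: matched pixel coordinates from the two key frames (Nx2 float32);
# K: pre-calibrated camera intrinsics (K_L = K_R for a single camera)
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)

# essential matrix from E = K^T F K (related only to the extrinsics)
E = K.T @ F @ K

# decompose E = [t]x R via SVD and select the physically valid (R, t)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

# 3D reconstruction of the matched set: triangulate with P0 = K[I|0], P1 = K[R|t]
P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P0, P1, pts1.T, pts2.T)
map_points = (pts4d[:3] / pts4d[3]).T          # initial map points (Nx3)
```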
In this embodiment, as the camera moves, the relative poses corresponding to each frame of the image sequence can be acquired in turn.
As a further improvement of the invention, performing visual tracking by using the ORB algorithm in a real-time matching and parallel local key frame mapping mode includes the following steps:
extracting image feature points and calculating descriptors from the current frame in a rasterized (grid-based) manner by using the ORB algorithm;
estimating the camera pose of the current frame by using a constant-velocity motion model, projecting all map points of the previous frame image onto the current image frame, matching feature points, and assigning the successfully matched map points of the previous frame to the corresponding feature points of the current frame;
updating the pose and map points of the current frame by using the LM algorithm and a Huber estimator;
and projecting all map points of the local key frames onto the current image frame according to the updated pose, matching feature points, assigning all successfully matched map points to the corresponding feature points of the current frame, and updating the pose and map points of the current frame again by using the LM algorithm and the Huber estimator.
In accordance with the further development of the invention described above, as shown in fig. 2, for the current frame (the $k$-th frame image $I_k$), a specific example of the tracking steps is as follows (a code sketch follows this list):
(1) The ORB algorithm is used to extract image feature points and calculate descriptors over rasterized areas (the image is divided equally into a series of grids of the same size); rasterized extraction ensures that feature points are extracted uniformly across the image, improving the stability and precision of subsequent tracking;
(2) The camera pose of the current frame is estimated with a constant-velocity motion model. All map points of the previous frame image $I_{k-1}$ are projected onto the current image frame, feature points are matched, and the successfully matched map points of the previous frame are assigned to the corresponding feature points of the current frame;
(3) The pose and map points of the current frame are updated using the LM (Levenberg-Marquardt) algorithm and a Huber estimator;
(4) According to the updated pose, all map points of the local key frames (excluding the map points already handled in (2)) are projected onto the current image frame and feature point matching is performed. After successful matching, all successfully matched map points are assigned to the corresponding feature points of the current frame, and the current frame pose $[R_{(0,k)} \mid t_{(0,k)}]$ and the current frame map points are updated again using the LM algorithm and the Huber estimator.
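The following non-limiting Python sketch illustrates steps (1) and (2); the grid dimensions, feature counts, and function names are illustrative assumptions:

```python
import cv2
import numpy as np

def extract_grid_orb(img, rows=8, cols=8, per_cell=30):
    """Step (1): split the image into equal-size grid cells and run ORB in
    each cell so the extracted feature points are uniformly distributed."""
    orb = cv2.ORB_create(nfeatures=per_cell)
    h, w = img.shape[:2]
    keypoints, descriptors = [], []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            kps, des = orb.detectAndCompute(img[y0:y1, x0:x1], None)
            if des is None:
                continue
            for kp in kps:                     # shift back to full-image coords
                kp.pt = (kp.pt[0] + x0, kp.pt[1] + y0)
            keypoints.extend(kps)
            descriptors.append(des)
    return keypoints, (np.vstack(descriptors) if descriptors else None)

def constant_velocity_prediction(T_prev, T_prev2):
    """Step (2): constant-velocity model -- reapply the last inter-frame
    motion to the previous pose (T_* are 4x4 world-to-camera matrices)."""
    motion = T_prev @ np.linalg.inv(T_prev2)
    return motion @ T_prev

def project_map_points(points_w, T_cw, K):
    """Project 3D map points into the predicted current frame to guide
    feature matching (returns Nx2 pixel coordinates)."""
    pts_c = T_cw[:3, :3] @ points_w.T + T_cw[:3, 3:4]
    uv = K @ pts_c
    return (uv[:2] / uv[2]).T
```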
In this embodiment, visual tracking is performed with the ORB algorithm in a real-time matching and parallel local key frame mapping mode, so as to obtain the visual pose. Real-time matching and pose estimation with the ORB algorithm run in the tracking thread, while the maintenance and mapping of local key frames run in the local key frame thread. Processing the tracking thread and the local key frame thread in parallel achieves real-time tracking with high efficiency.
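Steps (3) and (4) refine the pose by minimizing reprojection error under a robust Huber loss. A minimal sketch follows, with the caveat that SciPy's `lm` method does not accept robust losses, so its trust-region solver stands in here for the LM + Huber scheme; all names and the 2-pixel threshold are illustrative:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation as R

def refine_pose(pose6, points_w, uv_obs, K):
    """Refine the current-frame pose (steps (3)-(4)) by minimizing the
    reprojection error of its map points under a Huber robust loss.
    pose6 = [rotvec(3), t(3)]; points_w: Nx3; uv_obs: Nx2 matched pixels."""
    def residuals(x):
        rot, t = R.from_rotvec(x[:3]), x[3:]
        pts_c = rot.apply(points_w) + t       # world -> camera
        uv = K @ pts_c.T
        uv = (uv[:2] / uv[2]).T               # perspective projection
        return (uv - uv_obs).ravel()
    # 'trf' supports robust losses; f_scale sets the Huber threshold (pixels)
    sol = least_squares(residuals, pose6, method='trf',
                        loss='huber', f_scale=2.0)
    return sol.x
```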
As an improvement of the present invention, performing visual tracking by using the ORB algorithm in a real-time matching and parallel local key frame mapping mode to obtain the visual pose may further include:
judging whether a key frame needs to be added according to a time interval condition and/or the number of map points of the current frame: if more than a certain time has elapsed since the last key frame was added, or the number of map points of the current frame is below a threshold, adding a new key frame;
judging whether the current frame is a new key frame, and if so, adding new map points: matching all feature points of the new key frame that have no associated map points against all feature points in the local key frames, and performing 3D reconstruction after successful matching to obtain new map points;
and performing local bundle adjustment optimization to correct accumulated errors and obtain optimized poses and map points.
For this improvement, the present invention provides the following complete specific embodiment:
1) Adding a new key frame: whether a key frame needs to be added is judged according to the time dimension and the number of map points of the current frame. When more than a certain time has elapsed since the last key frame was added, or the number of map points of the current frame is below a threshold, a new key frame is added;
2) If the current frame is a new key frame, new map points are added: all feature points of the new key frame that have no associated map points are matched against all feature points in the local key frames, and new map points are obtained by 3D reconstruction after successful matching;
3) To ensure tracking efficiency and continuity, the number of local key frames is controlled: when the number of key frames exceeds a threshold, the key frame that was added to the local key frames earliest is deleted;
4) Local bundle adjustment (Bundle Adjustment) optimization is performed to correct accumulated errors, yielding the optimized poses and map points.
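A non-limiting sketch of the key-frame management policy in steps 1) and 3) is given below; the interval, point-count, and window-size thresholds are illustrative assumptions:

```python
def need_new_keyframe(t_now, t_last_kf, n_map_points,
                      min_interval=0.5, min_points=50):
    """Step 1): add a key frame when enough time has elapsed since the
    last one, or when the current frame tracks too few map points."""
    return (t_now - t_last_kf) > min_interval or n_map_points < min_points

def prune_local_keyframes(local_keyframes, max_keyframes=10):
    """Step 3): bound the local key-frame set by deleting the key frame
    that was added earliest (front of the list)."""
    while len(local_keyframes) > max_keyframes:
        local_keyframes.pop(0)
```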
In this embodiment, steps 1) to 4) may be placed in the local key frame thread (running in parallel with the tracking steps (1) to (4) of the preceding embodiment) for parallel processing, so as to improve efficiency. Repeating steps (1) to (4) together with steps 1) to 4) enables continuous tracking.
This embodiment ensures tracking continuity while reducing the number of key frames to be processed, thereby shortening processing time and improving motion tracking efficiency.
In the present invention, the Kalman fusion of the visual pose and the IMU pose may be implemented by various methods known to those skilled in the art; preferably, it may be performed with reference to the following embodiment:
For convenience of description, the subscripts w, i, v, c are defined to denote the world coordinate system, the IMU coordinate system, the visual coordinate system, and the camera coordinate system respectively, as shown in fig. 3; the coordinate system definitions are shown in fig. 4.
step 1: assuming that the inertial measurement includes a specific bias b and white gaussian noise n, the actual angular velocity ω and the actual acceleration a are as follows:
ω=ωm-bω-nωa=am-ba-na
where the subscript m denotes the measured value, the dynamic deviation b can be expressed as a random process:
the state of the filter includes the position of the IMU in the world coordinate systemAnd the speed of the world coordinate system relative to the IMU coordinate systemAnd attitude four-elementAt the same time, there is also the gyro and accelerometer bias bω,baAnd a visual scale factor λ. And calibrating the rotational relationship between the obtained IMU and the cameraTranslation relationA state vector X comprising 24 elements can thus be obtained, as shown in the prediction module of fig. 5.
Step 2: in the state expression above, the attitude is described by a quaternion. A quaternion error is therefore used to represent the attitude error and its covariance, which improves numerical stability and gives a minimal representation; accordingly, an error state vector of 22 elements is defined.
Considering an estimated value $\hat{x}$ and its true value $x$, the error is defined as $\tilde{x} = x - \hat{x}$. This definition is used for all state variables except the quaternion, whose error is defined multiplicatively as $\delta q = q \otimes \hat{q}^{-1} \approx [1, \tfrac{1}{2}\delta\theta^T]^T$.
From this, a linearized model of the continuous error state can be obtained:

$\dot{\tilde{x}} = F_c \tilde{x} + G_c n$

where $n$ is the noise vector. Since the speed of the algorithm is of particular concern here, $F_c$ and $G_c$ are assumed constant over the integration time between two adjacent states, and the model is discretized as

$F_d = \exp(F_c \Delta t) \approx I_d + F_c \Delta t + \tfrac{1}{2} F_c^2 \Delta t^2$

Meanwhile, the discrete-time covariance matrix $Q_d$ can be obtained by integration:

$Q_d = \int_{\Delta t} F_d(\tau)\, G_c\, Q_c\, G_c^T\, F_d(\tau)^T \, d\tau$
With the calculated $F_d$ and $Q_d$, the state covariance matrix is propagated according to the Kalman filter:

$P_{k+1|k} = F_d P_{k|k} F_d^T + Q_d$
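A non-limiting sketch of this discretization and propagation step follows; the trapezoidal approximation of $Q_d$ is an illustrative choice, with `F_c`, `G_c`, and `Q_c` assumed given:

```python
import numpy as np
from scipy.linalg import expm

def propagate_covariance(P, F_c, G_c, Q_c, dt):
    """Discretize the continuous error-state model (F_c, G_c held constant
    over dt, as assumed above) and propagate the state covariance:
    P_{k+1|k} = F_d P_{k|k} F_d^T + Q_d."""
    F_d = expm(F_c * dt)                     # matrix exponential of F_c * dt
    # trapezoidal approximation of the integral defining Q_d
    GQG = G_c @ Q_c @ G_c.T
    Q_d = 0.5 * dt * (F_d @ GQG @ F_d.T + GQG)
    return F_d @ P @ F_d.T + Q_d
```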
Step 3: for the camera position measurement, the pose $[R_{(0,k)} \mid t_{(0,k)}]$ is obtained from the visual tracking of the single camera, with $p_v^c$ and $q_v^c$ the position vector and rotation quaternion describing the camera pose; the corresponding measured position is then obtained. This gives the following measurement model:

$z_p = p_v^c = C_{(q_v^w)} \left( p_w^i + C_{(q_w^i)}^T\, p_i^c \right) \lambda + n_p$

where $p_w^i$ and $q_w^i$ give the pose of the IMU in the world coordinate system, and $C_{(q_v^w)}$ is the rotation of the visual coordinate system relative to the world coordinate system.
Step 4: define a position measurement error model

$\tilde{z}_p = z_p - \hat{z}_p = H_p \tilde{x}$

and a rotation measurement error model

$\tilde{z}_q = z_q \otimes \hat{z}_q^{-1} = H_q \tilde{x}$

where $H_p$ and $H_q$ are the measurement matrices of the error state quantities $\tilde{z}_p$ and $\tilde{z}_q$ respectively. Finally, the measurement matrices can be stacked as $H = [H_p^T \;\; H_q^T]^T$.
Step 5: once the measurement matrix $H$ has been acquired, the update can proceed according to the steps of the Kalman filter, as shown by the update block in fig. 5.
Calculating a residual vector:
calculating a new tracking quantity: s ═ HPHT+R;
Calculating Kalman gain K-PHTS-1;
And (3) calculating correction amount:according to the correction amountWe can calculate the update amount of the X state. The error state four elements can be updated as follows:
Pk+1|k+1=(Id-KH)Pk+1|k(Id-KH)T+KRKT
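These update steps map directly onto a few lines of linear algebra; the following non-limiting sketch (function and argument names are illustrative) uses the Joseph form shown above for the covariance update:

```python
import numpy as np

def ekf_update(P, H, R_meas, residual):
    """One measurement update following the steps above: innovation
    covariance, Kalman gain, error-state correction, and Joseph-form
    covariance update."""
    S = H @ P @ H.T + R_meas                 # S = H P H^T + R
    K = P @ H.T @ np.linalg.inv(S)           # K = P H^T S^-1
    dx = K @ residual                        # correction of the error state
    I = np.eye(P.shape[0])
    P_new = (I - K @ H) @ P @ (I - K @ H).T + K @ R_meas @ K.T
    return dx, P_new
```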
Through the monocular tracking and IMU fusion described above, stable pose output is obtained on the mobile terminal, thereby realizing stable motion tracking.
The above embodiment is only one example of the Kalman fusion of the visual pose and the IMU pose in the present invention; besides this embodiment, other methods known to those skilled in the art may also be adopted to achieve the technical effects of the present invention.
In the method embodiments of the present invention, the sequence numbers of the steps are not intended to limit their order; for those skilled in the art, changing the order of the steps without creative effort also falls within the scope of the invention.
On the other hand, corresponding to the above method, the present invention further provides a stable motion tracking device fusing a smartphone monocular camera and an IMU, as shown in fig. 7, including:
the map initialization module 11 is configured to process the acquired image by using an ORB algorithm, and then perform 3D reconstruction to obtain initial map points and complete map initialization;
the visual tracking module 12 is used for performing visual tracking in a mode of matching in real time and parallel local key frame mapping by using an ORB algorithm to obtain a visual pose;
the IMU pose calculation module 13: the IMU pose prediction method comprises the steps of obtaining acceleration and angular velocity values generated by the IMU in a three-dimensional space, and performing integral operation on the acceleration and angular velocity values to obtain an IMU pose prediction result;
the fusion module 14: the method is used for performing Kalman fusion on the prediction results of the visual pose and the IMU pose and performing motion tracking according to pose information obtained after fusion.
Compared with the prior art, the device achieves more stable motion tracking and realizes real-time online scale estimation.
As an improvement of the present invention, the map initialization module 11 is further configured to:
extracting feature points of the acquired first frame image by using an ORB algorithm, calculating a descriptor, recording the first frame as a key frame, and marking the absolute pose of the camera;
after the camera translates a certain distance, extracting feature points and calculating descriptors from the acquired image by using the ORB algorithm, matching them with the feature points of the first frame image, recording the second frame as a key frame, and calculating the relative pose of the camera at the second frame with respect to the first frame;
and performing 3D reconstruction on the successfully matched feature point set to obtain initial map points.
In the invention, the ORB algorithm is adopted to extract features and directly perform matching and pose estimation. ORB is an algorithmic improvement combining FAST corner detection with the BRIEF feature descriptor, balancing efficiency and precision in the monocular visual tracking process.
As a further improvement of the present invention, calculating the relative pose of the camera at the second frame with respect to the first frame includes:
calculating a fundamental matrix between the two frames of images according to the corresponding matched feature point sets on the first frame image and the second frame image;
calculating an essential matrix according to the fundamental matrix and the intrinsic parameters of the camera;
and performing singular value decomposition on the essential matrix to obtain the relative pose of the camera at the second frame with respect to the first frame.
According to the invention, as the camera moves, the relative poses corresponding to each frame of the image sequence can be obtained in sequence.
As a further improvement of the present invention, the visual tracking module 12 is further configured to:
extracting image feature points and calculating descriptors from the current frame in a rasterized (grid-based) manner by using the ORB algorithm;
estimating the camera pose of the current frame by using a constant-velocity motion model, projecting all map points of the previous frame image onto the current image frame, matching feature points, and assigning the successfully matched map points of the previous frame to the corresponding feature points of the current frame;
updating the pose and map points of the current frame by using the LM algorithm and a Huber estimator;
and projecting all map points of the local key frames onto the current image frame according to the updated pose, matching feature points, assigning all successfully matched map points to the corresponding feature points of the current frame, and updating the pose and map points of the current frame again by using the LM algorithm and the Huber estimator.
In the invention, visual tracking is performed with the ORB algorithm in a real-time matching and parallel local key frame mapping mode, so as to obtain the visual pose. Real-time matching and pose estimation with the ORB algorithm run in the tracking thread, while the maintenance and mapping of local key frames run in the local key frame thread. Processing the tracking thread and the local key frame thread in parallel achieves real-time tracking with high efficiency.
As an improvement of the present invention, the visual tracking module 12 is further configured to:
judging whether a key frame needs to be added according to a time interval condition and/or the number of map points of the current frame: if more than a certain time has elapsed since the last key frame was added, or the number of map points of the current frame is below a threshold, adding a new key frame;
judging whether the current frame is a new key frame, and if so, adding new map points: matching all feature points of the new key frame that have no associated map points against all feature points in the local key frames, and performing 3D reconstruction after successful matching to obtain new map points;
and performing local bundle adjustment optimization to correct accumulated errors and obtain optimized poses and map points.
The invention can ensure the tracking continuity, reduce the number of key frames to be processed, reduce the processing time and improve the motion tracking efficiency.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.