Disclosure of Invention
Embodiments of the invention aim to provide a visual tracking method and apparatus for a moving target, an electronic device and a storage medium, so as to improve the real-time performance and stability of the tracking algorithm. The specific technical solutions are as follows:
An embodiment of the invention provides a visual tracking method for a moving target, which comprises the following steps:
determining a moving target to be tracked in a first video frame, determining position information of the moving target in the first video frame, and extracting a first feature of the moving target in the first video frame;
acquiring acceleration information and angular velocity information of the moving target in a second video frame; wherein the second video frame is a next video frame of the first video frame;
calculating the position of the moving object in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving object in the first video frame, and extracting a second feature of the moving object at the position in the second video frame;
matching the first feature with the second feature to obtain a matched feature;
and applying an optical flow algorithm to the matched feature to obtain position feature information of the first feature in the second video frame.
Specifically, the acquiring acceleration information and angular velocity information of the moving object in the second video frame includes:
acquiring acceleration information of the moving target in a second video frame through an acceleration sensor;
and acquiring the angular velocity information of the moving target in the second video frame through a gyroscope sensor.
Specifically, the calculating the position of the moving object in the second video frame according to the acceleration information and the angular velocity information includes:
calculating the position change information of the moving target according to the acceleration information and the angular velocity information;
and determining the position of the moving object in the second video frame according to the position information of the moving object in the first video frame and the position change information.
Specifically, the obtaining, by the optical flow algorithm, the position feature information of the first feature in the second video frame from the matched feature includes:
fusing the matching feature and the second feature to obtain a common feature of the matching feature and the second feature;
and adopting an optical flow algorithm to the common feature and the first feature to obtain the position feature information of the first feature in the second video frame.
Specifically, the obtaining the position feature information of the first feature in the second video frame by using the optical flow algorithm on the common feature and the first feature includes:
calculating a scaling and a rotation scale of the common feature relative to the first feature;
and obtaining the position feature information of the first feature in the second video frame by the common feature, the scaling and the rotation scale through a preset tracking algorithm.
Specifically, after the position feature information of the first feature in the second video frame is obtained from the matched feature by the optical flow algorithm, the method further includes:
voting the common features, the scaling and the rotation scale to generate a voting space;
clustering the voting space;
and counting the length of the clustered voting space.
Specifically, the method further comprises: and when the length of the clustered voting space is greater than a preset threshold value, performing Kalman filtering on the position characteristic information to obtain position information of the moving target in the second video frame.
The embodiment of the invention provides a moving target visual tracking device, which comprises:
the first extraction module is used for determining a moving target to be tracked in a first video frame, determining the position information of the moving target in the first video frame and extracting a first feature of the moving target in the first video frame;
the acquisition module is used for acquiring the acceleration information and the angular velocity information of the moving target in a second video frame; wherein the second video frame is a next video frame of the first video frame;
the second extraction module is used for calculating the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracting a second feature of the moving target at the position in the second video frame;
the matching module is used for matching the first feature with the second feature to obtain a matched feature;
and the first calculation module is used for applying an optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame.
Specifically, the acquisition module is specifically configured to acquire, through an acceleration sensor, the acceleration information of the moving target in the second video frame;
and acquire, through a gyroscope sensor, the angular velocity information of the moving target in the second video frame.
Specifically, the second extraction module includes:
the first calculation submodule is used for calculating the position change information of the moving target according to the acceleration information and the angular velocity information;
and the second calculation submodule is used for determining the position of the moving target in the second video frame according to the position information of the moving target in the first video frame and the position change information.
Specifically, the first calculation module includes:
the fusion submodule is used for fusing the matching feature and the second feature to obtain a common feature of the matching feature and the second feature;
and the third calculation submodule is used for obtaining the position characteristic information of the first characteristic in the second video frame by adopting an optical flow algorithm on the common characteristic and the first characteristic.
Specifically, the third computation submodule includes:
a first calculation unit configured to calculate a scaling and a rotation scale of the common feature with respect to the first feature;
and the second calculation unit is used for obtaining the position characteristic information of the first characteristic in the second video frame by the common characteristic, the scaling and the rotation scale through a preset tracking algorithm.
Specifically, the apparatus further comprises:
the voting module is used for voting the common characteristics, the scaling and the rotation scale to generate a voting space;
the clustering module is used for clustering the voting space;
and the counting module is used for counting the length of the clustered voting space.
Specifically, the apparatus further comprises:
and the filtering module is used for performing Kalman filtering on the position characteristic information to obtain the position information of the moving target in the second video frame when the length of the clustered voting space counted by the counting module is greater than a preset threshold value.
An embodiment of the present invention provides an electronic device, which includes a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and when the processor executes the machine-executable instructions, the method steps described above are implemented.
An embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the method steps as described above.
Embodiments of the present invention also provide a computer program product containing instructions, which when run on a computer, cause the computer to perform a method for visual tracking of a moving object as described above.
Embodiments of the present invention also provide a computer program, which when run on a computer, causes the computer to execute a method for visually tracking a moving object as described above.
Embodiments of the invention provide a visual tracking method and apparatus for a moving target, an electronic device and a storage medium. The method first determines a moving target to be tracked in a first video frame, determines position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies an optical flow algorithm to the matched feature to obtain position feature information of the first feature in the second video frame. The method provided by the embodiments of the invention can improve the real-time performance of the tracking algorithm and improve tracking efficiency. Of course, it is not necessary for any product or method embodying the invention to achieve all of the above advantages at the same time.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, tracking algorithms based on local information tend to fail when the target deforms or is occluded, and generally require sliding-window matching, which is computationally expensive. Detection-based tracking algorithms can follow a gradually changing target by updating the model, but they generally give only the position of the target and cannot recover its attitude information (rotation angle and the like). Tracking algorithms based on feature points obtain corresponding matched feature points, estimate the position, attitude and other information of the target by the least-squares method, and can tolerate a certain amount of occlusion and deformation. However, although the CMT algorithm offers good tracking performance and high efficiency, it cannot fully meet the tracking requirements of mobile devices, which are constrained in computational efficiency and power consumption, and it places high demands on the accuracy of the feature points; feature points extracted in practical applications usually carry small-range errors, making it difficult to satisfy target tracking applications that demand greater stability, such as augmented reality (AR) mapping.
In order to improve the real-time performance and stability of the visual tracking algorithm and improve the calculation efficiency, embodiments of the present invention provide a moving target visual tracking method, apparatus, electronic device, and storage medium, which are described in detail below.
Fig. 1 is a flowchart of a moving target visual tracking method according to an embodiment of the present invention, including the following steps:
step 101, determining a moving object to be tracked in a first video frame, determining position information of the moving object in the first video frame, and extracting a first feature of the moving object in the first video frame.
The visual tracking method provided by the embodiment of the invention can be applied to electronic devices such as laptop computers, desktop computers and smartphones. The input received by the processor of the electronic device may be a plurality of video frames, which may be a group of temporally adjacent frames obtained by shooting the same moving object; they may also be captured in real time with a smartphone or the like, or obtained from a gallery of the electronic device. Any video frame among the plurality of video frames received by the electronic device can serve as the first video frame.
When receiving the first video frame, the processor may enclose the moving object to be tracked in a rectangular frame by determining the side length L of the moving object's frame, and determine the position information of the moving object, where the position information may include: the spatial coordinates of the moving object in the first video frame, the angle between the moving object and the horizontal plane in the first video frame, and the like. In the first video frame, the moving object to be tracked is called the foreground, and the region outside the frame is called the background. All feature points in the first video frame, covering both foreground and background, are detected, and the features of the object inside the frame, namely the moving target, are extracted; the first video frame can be read directly through OpenCV and the features of the moving target extracted as the first feature.
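By way of illustration only, the following Python sketch shows one way this step could look with OpenCV, assuming ORB keypoints stand in for the first feature; the video path, box coordinates and the choice of detector are illustrative assumptions, not prescribed by the embodiment.

```python
import cv2

# Minimal sketch: read the first video frame and extract keypoints and
# descriptors of the target inside a chosen box. ORB is used only as an
# illustrative detector; the embodiment does not prescribe one.
cap = cv2.VideoCapture("video.mp4")       # hypothetical input
ok, first_frame = cap.read()
assert ok, "could not read the first video frame"

x, y, L = 100, 80, 120                    # assumed box: top-left corner and side length L
gray = cv2.cvtColor(first_frame, cv2.COLOR_BGR2GRAY)

orb = cv2.ORB_create()
keypoints = orb.detect(gray, None)        # detects over foreground and background
# keep only the keypoints inside the target box (the foreground)
fg_kps = [kp for kp in keypoints
          if x <= kp.pt[0] <= x + L and y <= kp.pt[1] <= y + L]
fg_kps, first_descriptors = orb.compute(gray, fg_kps)  # the "first feature"
```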
Extracting the features of the moving object includes extracting its color features, texture features, shape features and the like, and for a moving object may further include extracting motion features. A color feature is a global feature based on pixel values; because color is insensitive to changes in the direction, size and the like of an image or image region, color features cannot capture the local characteristics of an object in the image well. The color histogram is the most common way to describe color features; its advantages are invariance to image rotation and translation and, with normalization, invariance to image scale changes as well, while its disadvantage is that it does not express the spatial distribution of color. Color features can also be described by color sets, color moments and color correlograms.
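For concreteness, a normalized color histogram of the kind described above could be computed as follows; the HSV color space and the bin counts are illustrative assumptions.

```python
import cv2

# Sketch of a color-histogram descriptor: invariant to rotation and
# translation, and (after L1 normalization) to scale changes as well.
def color_histogram(region_bgr):
    hsv = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2HSV)
    # 32 hue bins x 32 saturation bins
    hist = cv2.calcHist([hsv], [0, 1], None, [32, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, alpha=1.0, norm_type=cv2.NORM_L1)
    return hist
```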
A texture feature is also a global feature; it describes the surface properties of the scene corresponding to the image or image region. However, since texture is only a property of an object's surface and cannot fully reflect its essential attributes, high-level image content cannot be obtained from texture features alone. Unlike color features, texture features are not pixel-based; they require statistical computation over regions containing multiple pixels. In pattern matching, such region-based features have a great advantage: matching does not fail because of small local deviations. As statistical features, texture features often possess rotation invariance and robustness to noise. Their disadvantage is that the computed texture may deviate significantly when the image resolution changes. Texture features can be described by statistical methods, geometric methods, model-based methods and signal-processing methods.
Shape features have two limitations: many of them describe only local properties of the moving target, and a comprehensive description of the target places high demands on computation time and storage; moreover, the shape information reflected by many shape features does not fully agree with human visual perception, i.e., similarity in the feature space can differ from the similarity perceived by the human visual system. Shape features can be described by boundary feature methods, Fourier shape descriptors, geometric parameter methods and the like.
And 102, acquiring acceleration information and angular velocity information of the moving object in a second video frame.
After the processor extracts the first feature of the moving object in the first video frame, a second video frame can be obtained through real-time shooting or directly from a local gallery, wherein the second video frame is the next video frame of the first video frame, and the first video frame and the second video frame are adjacent video frames in terms of time. And acquiring acceleration information and angular velocity information of the moving target to be tracked in the second video frame, wherein the acceleration information and the angular velocity information can represent the position and posture change of the moving target in the second video frame.
A specific way of acquiring the acceleration information and angular velocity information of the moving object in the second video frame is as follows: the acceleration information of the moving object in the second video frame is acquired through an acceleration sensor, and the angular velocity information of the moving object in the second video frame is acquired through a gyroscope sensor.
The acceleration sensor and the gyroscope sensor can be pre-installed in the electronic device. For example, a smartphone may carry a micro-electro-mechanical systems (MEMS) gyroscope, which obtains the angular velocity by measuring the Coriolis acceleration generated by rotation, while the acceleration sensor obtains the acceleration information by direct measurement. When a smartphone is used to shoot the moving target, shooting continuously along the target's direction of motion keeps the smartphone and the moving target relatively stationary, so the acceleration information and angular velocity information obtained by the acceleration sensor and the gyroscope sensor describe the motion of the smartphone, from which the motion information of the moving target can be determined. Alternatively, a gravity sensor, an orientation sensor, an attitude sensor and the like can be pre-installed in the electronic device, and the acceleration information and angular velocity information of the moving target in the second video frame can be obtained from the motion information of the electronic device.
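The embodiment only states that the two sensor readings are combined into a weighted sum; a minimal sketch of one plausible combination, with illustrative weights, is:

```python
import numpy as np

# Sketch of the motion measure Δa_t used in the side-length formula below:
# a weighted sum of the accelerometer and gyroscope magnitudes. The weights
# w_acc and w_gyro are illustrative assumptions, not values from the source.
def motion_measure(accel_xyz, gyro_xyz, w_acc=0.5, w_gyro=0.5):
    a = np.linalg.norm(accel_xyz)    # magnitude of the accelerometer reading
    w = np.linalg.norm(gyro_xyz)     # magnitude of the gyroscope reading
    return w_acc * a + w_gyro * w
```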
And 103, calculating the position of the moving object in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving object in the first video frame, and extracting a second feature of the moving object at the position in the second video frame.
After the acceleration information and the angular velocity information of the moving object in the second video frame are obtained, the position of the moving object in the second video frame can be calculated from them together with the position information of the moving object in the first video frame. The acceleration and angular velocity information reflect the position and attitude change of the moving object in the second video frame, from which its position in the second video frame can be derived. The position obtained in this step is a rough, candidate estimate of the moving object's position in the second video frame.
As a specific way of obtaining the position of the moving object in the second video frame in the embodiment of the invention: assuming the center of the moving object in the first video frame is u, the center of the moving object's position in the second video frame is also taken as u, and the side length of the frame is L_t = β · s · Δa_t · L, where L is the side length of the moving object's frame, β is a common parameter, set to 2 in the embodiment of the invention, s is the scaling coefficient of the tracked object in the previous frame, and Δa_t is a weighted sum of the acceleration information and the angular velocity information of the moving object.
As the side-length formula shows, the larger the weighted sum of the acceleration information and the angular velocity information, the larger the side length and hence the frame of the moving object, and the larger the search range used by the method provided by the embodiment of the invention. Similarly, the smaller the weighted sum, the smaller the side length and frame, and the smaller the search range.
In the embodiment of the invention, introducing the acceleration information and the angular velocity information allows the search range for the moving target to change dynamically, which improves computational efficiency; when the electronic device moves significantly, the search range can be enlarged appropriately, avoiding tracking failure caused by an overly small search range.
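A minimal sketch of the dynamic search box implied by the side-length formula, with β = 2 as in the embodiment; the clamping bounds are an added assumption to keep the box usable, not part of the source.

```python
# Search box from L_t = beta * s * delta_a_t * L: larger device motion
# yields a larger search range, smaller motion a smaller one.
def search_box(center_u, L, s, delta_a_t, beta=2.0):
    Lt = beta * s * delta_a_t * L
    Lt = max(L, min(Lt, 4 * L))                 # assumed clamp, not from the source
    cx, cy = center_u
    return (cx - Lt / 2, cy - Lt / 2, Lt, Lt)   # x, y, width, height
```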
After the position of the moving object in the second video frame is obtained, the processor can read the moving object to be tracked in the second video frame directly through OpenCV and extract a second feature of the moving object at that position in the second video frame. The extraction method is the same as in step 101 and includes extracting the color features, texture features, shape features and the like of the moving object, and may also include extracting motion features. It should be noted that the second feature extracted in this step is of the same type as the first feature: if in step 101 the color feature of the moving object was extracted and described by a color histogram, the second feature is likewise a color feature; when texture or shape features are used to describe the moving object, they must likewise be consistent with the first feature. Keeping the second feature consistent with the first ensures that features extracted under the same standard can be matched in the subsequent step.
And step 104, matching the first feature with the second feature to obtain a matched feature.
The first feature and the second feature obtained in steps 101 and 103 are matched; the specific matching method can be chosen according to the extracted features. For example, when the color feature of the moving object is extracted and described by a color histogram, the color features can be matched by the histogram intersection method, distance method, center-distance method, reference color table method, cumulative color histogram method and the like; when texture features are extracted, they can be matched through gray-level co-occurrence matrices, wavelet transforms and the like; when shape features are extracted, shape matching based on wavelets and relative moments can be performed. Matching the first feature with the second feature yields the matched feature, which represents the similar parts of the first and second features.
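When the features are keypoint descriptors (as in the ORB sketch of step 101), matching could look like the following; brute-force Hamming matching and the 0.75 ratio test are common heuristics, not requirements of the embodiment.

```python
import cv2

# Sketch: match first-frame descriptors against second-frame descriptors.
def match_features(first_descriptors, second_descriptors):
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    candidates = bf.knnMatch(first_descriptors, second_descriptors, k=2)
    # keep a match only if it is clearly better than the runner-up
    matched = [m for m, n in candidates if m.distance < 0.75 * n.distance]
    return matched   # the "matched feature": similar parts of both features
```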
And step 105, applying an optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame.
After the matched feature of the first and second features is obtained, the position feature information of the first feature in the second video frame is calculated by applying an optical flow algorithm to the matched feature. Optical flow algorithms are commonly used to track features across consecutive video frames: using the temporal changes of pixels in the frame sequence and the correlation between adjacent frames, they find the correspondence between the previous video frame and the current video frame and thereby compute the motion information of the moving object between adjacent frames. Applying the optical flow algorithm to the similar parts of the first and second features yields the position feature information of the first feature (from the first video frame) in the second video frame.
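A minimal sketch of this step with the pyramidal Lucas-Kanade optical flow in OpenCV; the window size and pyramid depth are illustrative defaults.

```python
import cv2
import numpy as np

# Track the matched feature points from the first frame into the second.
def track_points(first_gray, second_gray, prev_pts):
    pts = np.float32(prev_pts).reshape(-1, 1, 2)
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(
        first_gray, second_gray, pts, None, winSize=(21, 21), maxLevel=3)
    good = status.reshape(-1) == 1          # keep successfully tracked points
    return pts.reshape(-1, 2)[good], next_pts.reshape(-1, 2)[good]
```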
The moving target visual tracking method provided by the embodiment of the invention first determines a moving target to be tracked in a first video frame, determines the position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame. The method can improve the real-time performance of the tracking algorithm and improve tracking efficiency.
As another specific implementation of the present invention, in combination with the above embodiments, fig. 2 shows a flowchart of the method in step 103 for calculating the position of the moving object in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving object in the first video frame, comprising the following steps:
Step 1031, calculating position change information of the moving object according to the acceleration information and the angular velocity information.
After the processor acquires the acceleration information and the angular velocity information of the moving object in the second video frame, the position change information of the moving object in the second video frame can be calculated. For example, the motion acceleration of the moving object may be obtained from the acceleration information, and the motion direction and rotation angle may be obtained from the angular velocity information; from the motion acceleration, motion direction and rotation angle, the motion direction and motion distance of the moving object, i.e., its position change information relative to the first video frame, can be obtained.
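Under simple kinematics assumptions (constant acceleration over the inter-frame interval), step 1031 could be sketched as follows; the initial velocity v0_xyz and the interval dt are assumed inputs the embodiment does not specify.

```python
import numpy as np

# Integrate the accelerometer for displacement and the gyroscope for the
# rotation angle over the inter-frame interval dt.
def position_change(accel_xyz, gyro_xyz, v0_xyz, dt):
    a = np.asarray(accel_xyz, dtype=float)
    displacement = np.asarray(v0_xyz, dtype=float) * dt + 0.5 * a * dt ** 2
    rotation = np.asarray(gyro_xyz, dtype=float) * dt   # angle about each axis
    return displacement, rotation   # position change relative to the first frame
```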
And step 1032, determining the position of the moving object in the second video frame according to the position information of the moving object in the first video frame and the position change information.
From the position information of the moving object in the first video frame determined in step 101 and the position change information of the moving object in the second video frame relative to the first video frame, the change undergone by the moving object since the first video frame is known, and combining it with the position information in the first video frame gives the position of the moving object in the second video frame.
In the embodiment of the invention, the position change of the moving target is calculated from the acceleration and angular velocity information, and combining it with the position information in the first video frame gives the position of the moving target in the second video frame, so the obtained position is available in real time.
As another embodiment of the present invention, fig. 3 shows a flowchart of the method in step 105 for obtaining the position feature information of the first feature in the second video frame from the matched feature by the optical flow algorithm, comprising the following steps:
Step 1051, fusing the matched feature with the second feature to obtain a common feature of the matched feature and the second feature.
After the first feature and the second feature are matched, the obtained matched feature and the extracted second feature are fused to generate a new feature, yielding the common feature of the matched feature and the second feature.
And 1052, adopting an optical flow algorithm to the common feature and the first feature to obtain position feature information of the first feature in the second video frame.
Applying an optical flow algorithm to the obtained common feature and the first feature yields the position feature information of the first feature in the second video frame. As described above, the optical flow algorithm finds the correspondence between the previous and current video frames from the temporal changes of pixels in the frame sequence and the correlation between adjacent frames, and thereby computes the motion information of the moving object between adjacent frames. Processing the common feature and the first feature with the optical flow algorithm gives the position feature information of the first feature (from the first video frame) in the second video frame.
As another specific implementation of the present invention, in combination with the above embodiment, fig. 4 shows a flowchart of the method in step 1052 for obtaining the position feature information of the first feature in the second video frame by applying the optical flow algorithm to the common feature and the first feature, comprising the following steps:
Step 1052a, calculating the scaling and the rotation scale of the common feature relative to the first feature.
After the first feature and the second feature are matched, the obtained matched feature and the extracted second feature are fused to generate a new feature; once the common feature of the matched feature and the second feature is obtained, the scaling and the rotation scale of the common feature relative to the first feature are calculated. The pairwise relative distances and relative angles between points of the common feature and the first feature are computed, and comparing the common feature with the first feature yields the scaling of the common feature relative to the first feature.
As the side-length formula for the moving object's frame in the second video frame shows, the larger the weighted sum of the acceleration information and the angular velocity information, the larger the side length and the frame, the larger the search range used by the method of the embodiment, and the larger the magnification of the common feature relative to the first feature. Similarly, the smaller the weighted sum, the smaller the side length, frame and search range, and the smaller the common feature relative to the first feature. The rotation scale can be obtained by an analogous calculation.
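In the spirit of the CMT algorithm referenced below, the scaling and the rotation scale can be estimated from pairwise point geometry; the use of medians is CMT's robustness device, and pts0/pts1 are assumed to be arrays of corresponding point coordinates.

```python
import numpy as np

# Sketch of step 1052a: scaling and rotation of the common feature (pts1)
# relative to the first feature (pts0), both given as N x 2 arrays of
# corresponding point coordinates.
def scale_and_rotation(pts0, pts1):
    i, j = np.triu_indices(len(pts0), k=1)   # all point pairs
    d0 = pts0[i] - pts0[j]
    d1 = pts1[i] - pts1[j]
    scales = np.linalg.norm(d1, axis=1) / np.linalg.norm(d0, axis=1)
    angles = (np.arctan2(d1[:, 1], d1[:, 0])
              - np.arctan2(d0[:, 1], d0[:, 0]))
    # medians are robust to mismatched pairs
    return np.median(scales), np.median(angles)
```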
And 1052b, obtaining the position characteristic information of the first characteristic in the second video frame by the common characteristic, the scaling and the rotation scale through a preset tracking algorithm.
Each data feature point in the scaling, the rotation scale and the common feature obtained in step 1052a is processed by a preset tracking algorithm, such as the CMT algorithm or a local feature extraction algorithm, to obtain the position feature information of the first feature in the second video frame.
As another embodiment of the present invention, fig. 5a shows the processing that follows step 105, after the position feature information of the first feature in the second video frame has been obtained from the matched feature by the optical flow algorithm; the method further includes the following steps:
and step 106, voting the common features, the scaling and the rotation scale to generate a voting space.
After the position feature information of the first feature in the second video frame is obtained from the matched feature by the optical flow algorithm, the common feature, the scaling and the rotation scale are voted on. The principle of the voting operation is that, once the scaling and the rotation scale are compensated for, the position of each feature point relative to the center remains essentially unchanged, i.e., the feature point keeps its position relative to the center in the next frame. Because the image itself changes, however, exactly the same relative positions cannot be obtained: some feature points stay close to the center while others deviate greatly. A clustering method can then select the largest class as the best feature points.
The voting value of each feature point is calculated from the data in the common feature, the scaling and the rotation scale. Voting yields a set of feature points, each with high feature strength and accurate localization, and these feature points form a feature vector, which constitutes the voting space.
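A minimal sketch of the vote computation, assuming rel0 stores each common-feature point's offset from the target center in the first frame:

```python
import numpy as np

# Each point votes for the target center: its current position minus its
# first-frame offset, compensated by the estimated scale s and rotation alpha.
def cast_votes(pts1, rel0, s, alpha):
    R = np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha),  np.cos(alpha)]])
    return pts1 - s * rel0.dot(R.T)   # one row per vote: the voting space
```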
Step 107, clustering the voting space.
The generated voting space is clustered. Clustering is a data analysis method that groups together the feature points in the voting space with strong mutual dependency; the clustered voting space consists of these strongly dependent feature points and is a sub-vector of the feature vector formed by all the feature points.
And step 108, counting the length of the clustered voting space.
The clustered voting space is a feature sub-vector composed of the strongly dependent feature points; its length is calculated, and the obtained value is the length of the clustered voting space.
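Steps 107 and 108 could be sketched with hierarchical clustering as follows; the 20-pixel distance threshold and the use of SciPy are illustrative assumptions, and the size of the largest cluster serves as the "length" here.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

# Cluster the votes and keep the largest cluster of mutually close votes.
def largest_cluster(votes, thresh=20.0):
    labels = fclusterdata(votes, t=thresh, criterion="distance")
    best = np.argmax(np.bincount(labels)[1:]) + 1   # cluster labels start at 1
    inliers = votes[labels == best]
    return inliers, len(inliers)                    # length of the clustered space
```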
In some examples, the method provided by the embodiment of the present invention may further include, as shown in fig. 5 b:
and step 109, when the length of the clustered voting space is greater than a preset threshold, performing Kalman filtering on the position characteristic information to obtain position information of the moving target in the second video frame.
The feature points in the position feature information carry accuracy noise, which can affect the stability of the tracking result, so the influence of the noise can be removed by Kalman filtering. Specifically, Kalman filtering can be performed according to the formula R_t = R / Δa_t, where R is the initial noise covariance, Δa_t is the change information of the current video frame relative to the previous video frame obtained from the sensors, and R_t is the measurement noise covariance for the current video frame. When the electronic device moves faster, the measurement noise covariance R_t is reduced, lessening the lag of the Kalman filter; when the electronic device moves slowly, the noise covariance R_t is increased, making the filtered result smoother and more stable.
When the length of the clustered voting space counted in step 108 is greater than the preset threshold, i.e., there are enough strongly dependent feature points and the obtained features have been compared and matched, the latest parameters of the rectangular frame are calculated; the position information of the moving target within the frame is the position feature information of the target to be tracked, i.e., the preliminary tracking result. This preliminary result contains noise, and a stable tracking result is obtained only after Kalman filtering of the position feature information. If the length is smaller than the preset threshold, the rectangular frame is too small to enclose the moving target, and the tracking fails.
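A minimal sketch of the adaptive Kalman filtering of step 109 over the target center, with a constant-velocity model; R0 and the small floor on Δa_t are illustrative assumptions.

```python
import cv2
import numpy as np

# Constant-velocity Kalman filter whose measurement noise follows
# R_t = R / delta_a_t: fast device motion lowers R_t and reduces lag.
def make_filter():
    kf = cv2.KalmanFilter(4, 2)   # state: x, y, vx, vy; measurement: x, y
    kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                                    [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], np.float32)
    kf.errorCovPost = np.eye(4, dtype=np.float32)
    return kf

def filter_position(kf, measured_xy, delta_a_t, R0=1.0):
    Rt = R0 / max(delta_a_t, 1e-6)                    # R_t = R / delta_a_t
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * Rt
    kf.predict()
    state = kf.correct(np.float32(measured_xy).reshape(2, 1))
    return float(state[0, 0]), float(state[1, 0])     # filtered center
```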
The embodiment of the present invention further provides a moving target visual tracking apparatus, a structure diagram of which is shown in fig. 6, including:
a first extraction module 601, configured to determine a moving object to be tracked in a first video frame, and extract a first feature of the moving object in the first video frame;
an obtaining module 602, configured to obtain acceleration information and angular velocity information of the moving object in a second video frame;
a second extraction module 603, configured to calculate a position of the moving object in the second video frame according to the acceleration information and the angular velocity information, and extract a second feature of the moving object at the position in the second video frame;
a matching module 604, configured to match the first feature with the second feature to obtain a matched feature;
and a first calculation module 605, configured to apply the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame.
The moving target visual tracking apparatus provided by the embodiment of the invention first determines a moving target to be tracked in a first video frame, determines the position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame. The apparatus provided by the embodiment of the invention can improve the real-time performance of the tracking algorithm and improve tracking efficiency.
Specifically, the obtainingmodule 602 is specifically configured to obtain, by using an acceleration sensor, acceleration information of the moving object in the second video frame;
and acquiring the angular velocity information of the moving object in the second video frame through the gyroscope sensor.
Specifically, the structure diagram of the second extraction module 603 is shown in fig. 7, and includes:
a first calculation submodule 6031, configured to calculate position change information of the moving target according to the acceleration information and the angular velocity information;
and a second calculation submodule 6032, configured to determine the position of the moving object in the second video frame according to the position information of the moving object in the first video frame and the position change information.
Specifically, the structure diagram of thefirst calculating module 606 is shown in fig. 8, and includes:
a fusion submodule 6061, configured to fuse the matching feature with the second feature to obtain a common feature of the matching feature and the second feature;
and a third computing submodule 6062 configured to apply an optical flow algorithm to the common feature and the first feature to obtain position feature information of the first feature in the second video frame.
Specifically, the structure of the third computing submodule 6062 is shown in fig. 9, and includes:
a first calculation unit 60621 for calculating a scaling and a rotation scale of the common feature with respect to the first feature;
and a second calculation unit 60622, configured to obtain the position feature information of the first feature in the second video frame from the common feature, the scaling and the rotation scale through a preset tracking algorithm.
Specifically, a structure diagram of the moving object visual tracking apparatus provided by the embodiment of the present invention is shown in fig. 10, and further includes:
a voting module 606, configured to vote on the common features, the scaling and the rotation scale to generate a voting space;
a clustering module 607, configured to cluster the voting space;
and a statistics module 608, configured to count the length of the clustered voting space;
specifically, the apparatus provided in the embodiment of the present invention further includes:
and a filtering module, configured to perform Kalman filtering on the position feature information to obtain the position information of the moving target in the second video frame when the length of the clustered voting space counted by the statistics module 608 is greater than a preset threshold.
The embodiment of the invention provides electronic equipment, which comprises a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and when the processor executes the machine-executable instructions, the following method steps are realized:
determining a moving target to be tracked in a first video frame, determining position information of the moving target in the first video frame, and extracting a first feature of the moving target in the first video frame;
acquiring acceleration information and angular velocity information of the moving target in a second video frame; the second video frame is the next video frame of the first video frame;
calculating the position of the moving object in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving object in the first video frame, and extracting a second feature of the moving object at the position in the second video frame;
matching the first feature with the second feature to obtain a matched feature;
and applying the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame.
An embodiment of the present invention further provides an electronic device, as shown in fig. 11, including a processor 1101, a communication interface 1102, a memory 1103 and a communication bus 1104, where the processor 1101, the communication interface 1102 and the memory 1103 communicate with one another through the communication bus 1104;
the memory 1103 is configured to store a computer program;
the processor 1101 is configured to implement the following steps when executing the program stored in the memory 1103:
determining a moving target to be tracked in a first video frame, determining position information of the moving target in the first video frame, and extracting a first feature of the moving target in the first video frame;
acquiring acceleration information and angular velocity information of the moving target in a second video frame; the second video frame is the next video frame of the first video frame;
calculating the position of the moving object in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving object in the first video frame, and extracting a second feature of the moving object at the position in the second video frame;
matching the first feature with the second feature to obtain a matched feature;
and applying the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame.
The electronic device provided by the embodiment of the invention first determines a moving target to be tracked in a first video frame, determines the position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame. The embodiment of the invention can improve the real-time performance of the tracking algorithm and improve tracking efficiency.
The embodiment of the invention provides a computer readable storage medium, a computer program is stored in the computer readable storage medium, and when being executed by a processor, the computer program realizes the following method steps:
determining a moving target to be tracked in a first video frame, determining position information of the moving target in the first video frame, and extracting a first feature of the moving target in the first video frame;
acquiring acceleration information and angular velocity information of the moving target in a second video frame; the second video frame is the next video frame of the first video frame;
calculating the position of the moving object in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving object in the first video frame, and extracting a second feature of the moving object at the position in the second video frame;
matching the first feature with the second feature to obtain a matched feature; and applying the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame.
The computer-readable storage medium provided by the embodiment of the invention first determines a moving target to be tracked in a first video frame, determines the position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame. The embodiment of the invention can improve the real-time performance of the tracking algorithm and improve tracking efficiency.
Embodiments of the present invention also provide a computer program product containing instructions, which when run on a computer, cause the computer to perform a method for visual tracking of a moving object as described above.
The computer program product containing instructions provided by the embodiment of the invention first determines a moving target to be tracked in a first video frame, determines the position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame. The embodiment of the invention can improve the real-time performance of the tracking algorithm and improve tracking efficiency.
Embodiments of the present invention also provide a computer program, which when run on a computer, causes the computer to execute a method for visually tracking a moving object as described above.
The computer program provided by the embodiment of the invention first determines a moving target to be tracked in a first video frame, determines the position information of the moving target in the first video frame, and extracts a first feature of the moving target in the first video frame; acquires acceleration information and angular velocity information of the moving target in a second video frame, the introduction of which allows the search range for the moving target to be changed dynamically, improving computational efficiency; calculates the position of the moving target in the second video frame according to the acceleration information, the angular velocity information and the position information of the moving target in the first video frame, and extracts a second feature at that position in the second video frame; matches the first feature with the second feature to obtain a matched feature; and applies the optical flow algorithm to the matched feature to obtain the position feature information of the first feature in the second video frame. The embodiment of the invention can improve the real-time performance of the tracking algorithm and improve tracking efficiency.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
It should be noted that the apparatus, the electronic device, the storage medium, the computer program product containing the instructions, and the computer program provided in the embodiments of the present invention are respectively an apparatus, an electronic device, a storage medium, a computer program product containing the instructions, and a computer program that apply the above-mentioned moving object visual tracking method, and all embodiments of the above-mentioned moving object visual tracking method are applicable to the apparatus, the electronic device, the storage medium, the computer program product containing the instructions, and the computer program, and can achieve the same or similar beneficial effects.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.