CN103077539A

Info

Publication number: CN103077539A
Application number: CN2013100243579A
Authority: CN
Inventors: 肖刚; 许晓航
Original assignee: Shanghai Jiao Tong University
Current assignee: Shanghai Jiao Tong University
Priority date: 2013-01-23
Filing date: 2013-01-23
Publication date: 2013-05-01
Anticipated expiration: 2033-01-23
Also published as: CN103077539B

Abstract

The invention discloses a target tracking method that combines multiple features with Kalman filtering, intended for target tracking against complex backgrounds and under occlusion. A composite feature based on color, texture, and target motion information is used to improve the CamShift algorithm, and a Kalman filter predicts the target's motion state, which improves tracking stability and accuracy for moving targets in complex backgrounds. Color information is easily disturbed by illumination and by interfering background colors; target texture features are introduced to compensate for the weaknesses of color alone, and target motion information is then added to further exclude background interference. When the target is occluded, prior information from before the occlusion is used for least-squares fitting and extrapolation of the target trajectory to predict the target's position, which helps recapture the moving target when the occlusion ends.

Description

A method for tracking a moving target under complex background and occlusion conditions

Technical field

The present invention relates to an adaptive target tracking method, and in particular to a method for tracking a moving target under complex background and occlusion conditions.

Background technology

Moving-target tracking determines the position of a moving target of interest in each frame of a video sequence and associates the same target across frames. Detecting targets against complex backgrounds with optical sensors is important in both civilian and military fields, and the related sensor design, information processing, and system simulation methods have long been a research focus for scientists and engineers.

When a ground platform (portable or fixed) surveys its surroundings, or an airborne or spaceborne platform (such as an aircraft, airship, or satellite) surveys the ground, an optical detection system is commonly used to acquire images of the target and background for target detection, recognition, and tracking, or for terrain information acquisition and perception.

With the development of charge-coupled devices (CCD), CMOS imaging devices, and digital image processing, visible-light imaging has spread to many fields. A high-definition visible-light video alarm system with intelligent image processing can automatically identify different objects, detect abnormal conditions in the monitored scene, raise alarms, and provide useful information. It can be applied to counter-terrorism, incident response, aviation supervision, traffic management, customer behavior analysis, and similar scenarios.

In the middle and late 20th century, visible-light cameras evolved from black-and-white to color. Over the past 20 years, color cameras have been used widely in security video surveillance systems, and thanks to the advantages and current performance of CCDs, color CCD cameras have become the mainstream in video surveillance. Beyond military applications, visible-light video surveillance is moving toward high definition, networking, digitization, and intelligent monitoring. A new generation of CCD monitoring and alarm systems with intelligent image processing offers several significant advantages: reliable automatic monitoring around the clock; higher alarm accuracy, with fewer false alarms, fewer missed alarms, and less useless information; faster recognition response through fast image processing algorithms; and more effective use and broader application of video resources.

The traditional Mean Shift and CamShift algorithms perform well against simple backgrounds and run in real time, so they are widely used for video tracking of moving targets. Mean Shift tracking uses the probability distribution of pixel values in the target region as its feature; because its optimization converges quickly, the algorithm runs in real time and has a degree of robustness. The kernel window plays a central role in Mean Shift. It is usually determined by the initial tracking window, and its size is fixed. If the target's scale changes during tracking, and especially if the target grows larger than the kernel window, tracking can easily fail. To address this defect, G. Bradski proposed the CamShift algorithm as an improvement on Mean Shift. Its main difference from Mean Shift is that it adaptively adjusts the kernel window size during tracking to follow changes in target scale. However, both Mean Shift and CamShift essentially track with the target's color probability distribution, that is, a histogram feature. A histogram is a weak feature, so tracking easily fails when the background is complex or contains large areas of interfering color. Moreover, neither algorithm predicts the target trajectory, so effective tracking often cannot continue when the target maneuvers quickly or is lost to occlusion.

In summary, current video tracking technology still faces several difficulties in tracking moving targets, mainly: the influence of illumination intensity and weather changes; complex changes in target appearance, such as deformation, rotation, scaling, and displacement; tracking stability when the target moves quickly; occlusion of the moving target; background interference and other complicating factors; correct detection and segmentation of the moving target; image data fusion, for example in multi-camera tracking; and the real-time requirement of tracking.

Although video target tracking has been studied extensively over the past decade and has made significant progress, many conditions in real environments interfere with reliable observation of the target in the video image. Designing a method that tracks video targets accurately, quickly, and stably in complex environments therefore remains a challenging and pressing problem, particularly a moving-target tracking method that works under complex backgrounds and occlusion.

Those skilled in the art have therefore sought to develop a method for tracking a moving target under complex background and occlusion conditions.

Summary of the invention

To track moving targets under complex background and occlusion conditions, the present invention proposes a moving-target tracking method that incorporates a Kalman filter. The invention uses a two-dimensional color-texture histogram as the target feature, which suppresses the influence of illumination well. On top of the target back-projection image, it incorporates motion-template information derived from the background model to produce an improved back-projection image that effectively removes background interference. A Kalman filter predicts the target's motion, which increases tracking robustness and improves tracking accuracy when the target moves quickly. When the target is lost to occlusion, the target trajectory is fitted by least squares and the target's motion state is predicted by extrapolating with its prior velocity; once the target reappears, it is detected and tracking continues, which effectively handles target occlusion.

To achieve the above object, the invention combines color, texture, and motion information with CamShift and Kalman filtering to realize a moving-target tracking method for complex background and occlusion conditions. A simulation system for the method was implemented with Visual Studio 2008 and OpenCV 2.0.

The method of the invention comprises the following steps:

(1) Determine the target and the initial frame of the video image.

(2) Obtain the back projection of the two-dimensional histogram of the target's color and texture information. The back projection measures how well each location in the whole image matches the histogram distribution of the target region: the more similar a region is to the target feature, the larger its weight. The value at each point is then scaled to the range 0-255 to give a new grayscale image, the back-projection image. The color is the chrominance information of the target image; the histogram of the color is used to obtain a back-projection image in the video image that represents the color of the target. Color information is easily affected by illumination conditions, so the method combines it with texture information to cope with illumination changes. The texture is a grayscale image obtained from the target image by computing a gray-level co-occurrence matrix, with the texture extracted from the texture features of that matrix; the histogram of the target's texture in this grayscale image is used to obtain a back-projection image in the video image that represents the texture of the frame.

(3) Combine the target's two-dimensional color-texture histogram with the target's motion-template information to obtain an extended back-projection image. The motion-template information is the foreground image obtained by differencing the video image and the background model image.

(4) Perform an occlusion test on the target. If the target is judged not to be occluded, use a Kalman filter to predict the target's position in the next frame, and use the prediction as the starting point of the CamShift iteration in the next frame. If the target is judged to be occluded, stop the Kalman filter, predict the trajectory with a least-squares method, and run a CamShift search at each predicted position.

(5) Determine whether tracking has finished. If not, obtain the current frame of the video and go to step (2); if so, the method terminates.

The color uses the HSV color model, which separates hue, saturation, and brightness: the H component represents hue, the S component saturation, and the V component brightness.

Further, the color feature of the target is obtained from the histogram of the H component and its back-projection image.

The texture is extracted by Gabor filtering.

Further, the gray-level co-occurrence texture feature used to extract the texture is generated from the gray-level differences in the 8-neighborhood of each pixel. For each pixel of the grayscale image, the neighborhood gray-level differences are taken in the 45°, 135°, 90°, and 0° directions:

G1(x,y)=G(x+1,y+1)-G(x-1,y-1) (1)

G2(x,y)=G(x-1,y+1)-G(x+1,y-1) (2)

G3(x,y)=G(x,y+1)-G(x,y-1) (3)

G4(x,y)=G(x+1,y)-G(x-1,y) (4)

Here (x, y) are the pixel coordinates. After the gray-level differences in the four directions are obtained, the gray value of each pixel of the texture is defined as

G5(x,y)=[G1(x,y)+G2(x,y)+G3(x,y)+G4(x,y)]/4 (5)

This yields the texture.

The two-dimensional color-texture histogram is obtained by extending the usual histogram computation for a single-plane (one-dimensional) image to the two-plane image formed by the color and texture images. The back projection of the target's two-dimensional color-texture histogram is obtained by computing how well the distribution of that histogram matches at each location of the two-plane color and texture image.

The target's two-dimensional color-texture histogram is combined with the target's motion-template information to obtain the extended back-projection image; the motion-template information is the foreground image obtained by differencing each video frame against the background model.

The extended back-projection image is defined as follows:

\hat{P}(i,j,k) = \begin{cases} P(i,j,k), & M(i,j,k) = 1 \\ 0, & M(i,j,k) = 0 \end{cases} \quad (10)

where P(i,j,k) is the back-projection image of the two-dimensional color-texture histogram, \hat{P}(i,j,k) is the extended back-projection image, and M(i,j,k) is the motion-template information.

The occlusion test judges whether the target is occluded by the residual between the target's observed value and its optimal estimate:

r(k) = \sqrt{(x(k) - \hat{x}(k))^{2} + (y(k) - \hat{y}(k))^{2}} \quad (13)

where x and y are the coordinates of the target centroid along the x and y axes, k denotes the k-th frame, \hat{x}(k) and \hat{y}(k) are the estimated values of the target, and x(k) and y(k) are its observed values. A threshold α is defined: when r(k) > α, the target is judged to be occluded; when r(k) < α, it is judged not to be occluded.

In the Kalman filtering, a constant-velocity motion model is used for the target. The state vector of the target is defined as X(k) = [x(k), y(k), v_x(k), v_y(k)]^T and, consistent with the observation matrix H(k) below, the observation vector as Z(k) = [x(k), y(k)]^T, where x(k), y(k), v_x(k), and v_y(k) are the coordinates and velocities of the target centroid along the x and y axes, and

v_{x}(k) = \frac{x(k) - x(k-1)}{\Delta t},

v_{y}(k) = \frac{y(k) - y(k-1)}{\Delta t}.

The state-transition matrix Φ(k) is:

\Phi(k) = \begin{bmatrix} 1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (11)

The observation matrix H(k) is:

H(k) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \quad (12)

Here Δt is the time interval between t(k) and t(k-1); expressed as a frame-number difference, Δt = 1. The process-noise covariance matrix Q(k) and the observation-noise covariance matrix R(k) are set to identity matrices. The detected centroid and window of the target serve as the initial input to the CamShift algorithm, and the state vector X(k) is initialized at the same time: x(k) and y(k) are set to the detected centroid position, and v_x(k) and v_y(k) are set to 0. During tracking, at frame k, the prediction \hat{X}(k|k-1) of X(k) is obtained from the previous frame's optimal estimate \hat{X}(k-1|k-1) of the target centroid position. CamShift then computes the target centroid, which serves as the observation Z(k) to correct the prediction \hat{X}(k|k-1) and obtain the optimal estimate \hat{X}(k|k) of X(k). Finally, the prediction \hat{X}(k+1|k) of the target centroid in the next frame is computed, and its position components are used as the starting point of the CamShift centroid search in the next frame.

The method can be implemented as a simulation system based on Visual Studio 2008 and OpenCV 2.0.

The invention uses a composite feature of color, texture, and target motion information to improve the CamShift algorithm and combines it with a Kalman filter that predicts the target's motion state, improving tracking stability and accuracy for moving targets in complex backgrounds. Color information is easily disturbed by illumination and by interfering background colors; target texture features are introduced to compensate for the weaknesses of color alone, and target motion information is then added to further exclude background interference. When the target is occluded, the prior information from before the occlusion is used for least-squares fitting and extrapolation of the target trajectory to predict the target's position, which helps recapture the moving target when the occlusion ends. Through the complementary information of multiple features and real-time prediction of the target, the method achieves continuous, stable tracking in complex backgrounds and when the target is briefly occluded.

The concept, specific structure, and technical effects of the invention are described further below with reference to the accompanying drawing, so that its purpose, features, and effects can be fully understood.

Description of drawings

Fig. 1 is a flowchart of the target tracking algorithm under complex background and occlusion conditions in a preferred embodiment of the invention.

Embodiment

As shown in Fig. 1, in this embodiment the method comprises the following steps:

Step 1: determine the tracking target and the initial frame of the corresponding video.

Step 2: obtain the back projection of the two-dimensional histogram of the target's color and texture information. The back projection measures how well each location in the whole image matches the histogram distribution of the target region: the more similar a region is to the target feature, the larger its weight. The value at each point is then scaled to the range 0-255 to give a new grayscale image, the back-projection image. The color is the chrominance information of the target image; the histogram of the color is used to obtain a back-projection image in the video image that represents the color of the target. Color information is easily affected by illumination conditions, so the method combines it with texture information to cope with illumination changes. The texture is a grayscale image obtained from the target image by computing a gray-level co-occurrence matrix, with the texture extracted from the texture features of that matrix; the histogram of the target's texture in this grayscale image is used to obtain a back-projection image in the video image that represents the texture of the frame.

In this embodiment, the color information uses the HSV color model, which separates hue, saturation, and brightness: the H component represents hue, the S component saturation, and the V component brightness. More preferably, because the H component contains the color information of the object, the color feature of the target can be obtained from the histogram of the H component and its back projection.

Texture is a very important feature in image analysis. Unlike image features such as color and edges, it is strongly resistant to abrupt illumination changes and exhibits local ordering. Balancing performance and computational complexity, this embodiment uses a gray-level co-occurrence texture feature generated from the gray-level differences in the 8-neighborhood of each pixel. For each pixel of the grayscale image, the neighborhood gray-level differences are taken in the 45°, 135°, 90°, and 0° directions:

G1(x,y)=G(x+1,y+1)-G(x-1,y-1) (1)

G2(x,y)=G(x-1,y+1)-G(x+1,y-1) (2)

G3(x,y)=G(x,y+1)-G(x,y-1) (3)

G4(x,y)=G(x+1,y)-G(x-1,y) (4)

After the gray-level differences in the four directions are obtained, each pixel of the texture image is defined as

G5(x,y)=[G1(x,y)+G2(x,y)+G3(x,y)+G4(x,y)]/4 (5)

This yields the texture image. The gray-level histogram of the target in this image and its back projection are then computed as the target's texture information.
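Equations (1) to (5) can be illustrated with a minimal Python sketch. It assumes the grayscale image is a list of rows indexed as `gray[y][x]`, and it sets border pixels to 0 because the text does not say how pixels without a full 8-neighborhood are handled; the function name `texture_image` is hypothetical.

```python
def texture_image(gray):
    """Return G5 from equations (1)-(5) for a 2-D list of gray values.

    Assumption: border pixels, which lack a full 8-neighborhood, are set to 0.
    """
    height, width = len(gray), len(gray[0])

    def g(x, y):
        # G(x, y): x is the column index, y is the row index.
        return gray[y][x]

    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            g1 = g(x + 1, y + 1) - g(x - 1, y - 1)  # equation (1)
            g2 = g(x - 1, y + 1) - g(x + 1, y - 1)  # equation (2)
            g3 = g(x, y + 1) - g(x, y - 1)          # equation (3)
            g4 = g(x + 1, y) - g(x - 1, y)          # equation (4)
            out[y][x] = (g1 + g2 + g3 + g4) / 4.0   # equation (5)
    return out
```

The result can be negative, so an implementation would still need to map it into a fixed range before building a histogram; that mapping is not specified in the text and is left out here.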

Step 3: combine the target's two-dimensional color-texture histogram with the target's motion-template information to obtain the extended back-projection image. The motion-template information is the foreground image obtained by differencing each video frame against the background model image.

In this embodiment, the target's color and texture information are fused by extending their individual one-dimensional histograms into a combined two-dimensional histogram. The back-projection image is then computed on the two-channel image formed by the H component and the texture.
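A minimal sketch of a two-dimensional hue-texture histogram and its back projection follows. The bin counts, the value ranges (`h_max=180`, `t_max=256`), the half-open region box, and the function names are assumptions made for illustration; the text does not fix any of them.

```python
def _bin(value, bins, vmax):
    # Map a value in [0, vmax) to a bin index in [0, bins).
    return min(int(value * bins / vmax), bins - 1)


def hist2d(hue, tex, region, h_bins, t_bins, h_max=180, t_max=256):
    """Joint hue-texture histogram over region = (x0, y0, x1, y1), half-open."""
    x0, y0, x1, y1 = region
    hist = [[0] * t_bins for _ in range(h_bins)]
    for y in range(y0, y1):
        for x in range(x0, x1):
            hb = _bin(hue[y][x], h_bins, h_max)
            tb = _bin(tex[y][x], t_bins, t_max)
            hist[hb][tb] += 1
    return hist


def back_project(hue, tex, hist, h_max=180, t_max=256):
    """Replace each pixel by its histogram weight, scaled to 0-255."""
    h_bins, t_bins = len(hist), len(hist[0])
    peak = max(max(row) for row in hist) or 1  # avoid division by zero
    out = []
    for h_row, t_row in zip(hue, tex):
        out.append([
            round(255 * hist[_bin(hv, h_bins, h_max)][_bin(tv, t_bins, t_max)] / peak)
            for hv, tv in zip(h_row, t_row)
        ])
    return out
```

Pixels whose hue and texture fall in a well-populated bin of the target histogram receive values near 255, and pixels in empty bins receive 0.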

Extracting the moving target in each frame effectively excludes background interference, narrows the search range, and improves tracking accuracy. First, the foreground target in the scene, that is, the motion template, is computed. To suppress image noise and obtain a more faithful background, the video frames over a certain period are averaged with weights to give an initial background that approximates the true background image. When the image sequence passes through this temporal low-pass filter, its slowly varying part is separated from the rapidly changing part. The background model is built as follows:

B(i,j,k+1)=B(i,j,k)+g(k)·(I(i,j,k)-B(i,j,k)) (6)

g(k)=β·(1-M(i,j,k))+α·M(i,j,k) (7)

\alpha = \frac{1}{\sqrt{2\pi}\,\sigma_{1}} \exp\left(-\frac{(Th - avg_{1})^{2}}{\sigma_{1}^{2}}\right) \quad (8)

\beta = \frac{1}{\sqrt{2\pi}\,\sigma_{2}} \exp\left(-\frac{(Th - avg_{2})^{2}}{\sigma_{2}^{2}}\right) \quad (9)

In these formulas, I(i,j,k) is the pixel value at coordinate (i,j) of frame k, B(i,j,k) is the current background image, B(i,j,k+1) is the background image for the next frame, β is the background factor, and α is the motion factor; both α and β lie between 0 and 1. They govern the adaptation of the background update: parts judged to be moving are assigned mostly to the foreground, while parts judged to be non-moving are absorbed mostly into the background. avg₁ and σ₁ are the mean and variance of the difference image between the current image and the background image, avg₂ and σ₂ are the mean and variance of the current image, M(i,j,k) is the binarized foreground image, and Th is the binarization threshold. The formula can be read as a recursive prediction of the background: it computes the background at the next time step by updating the background factor and motion factor in real time. The detected foreground target serves as the motion-template information of the scene; that is, the value of M(i,j,k) indicates whether a given pixel belongs to the moving target.
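The recursive update in equations (6) and (7) can be sketched as follows. The sketch makes two assumptions that are not in the text: α and β are passed in as fixed constants rather than computed from equations (8) and (9), and the mask M is obtained by thresholding the absolute difference between the frame and the background at Th. The function names are hypothetical.

```python
def motion_mask(bg, frame, th):
    """Binarized foreground M: 1 where |I - B| > Th (assumed rule), else 0."""
    return [
        [1 if abs(i - b) > th else 0 for b, i in zip(b_row, i_row)]
        for b_row, i_row in zip(bg, frame)
    ]


def update_background(bg, frame, mask, alpha, beta):
    """One step of B(k+1) = B(k) + g(k) * (I(k) - B(k)), equation (6)."""
    new_bg = []
    for b_row, i_row, m_row in zip(bg, frame, mask):
        row = []
        for b, i, m in zip(b_row, i_row, m_row):
            g = beta * (1 - m) + alpha * m  # equation (7)
            row.append(b + g * (i - b))
        new_bg.append(row)
    return new_bg
```

With M = 0 the gain reduces to β, and with M = 1 it reduces to α, so the two factors set separate update rates for background pixels and moving pixels.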

Combining the motion template of the scene, the final back-projection image is computed, defined as follows:

\hat{P}(i,j,k) = \begin{cases} P(i,j,k), & M(i,j,k) = 1 \\ 0, & M(i,j,k) = 0 \end{cases} \quad (10)

where P(i,j,k) is the back-projection image of the target's color-texture feature, \hat{P}(i,j,k) is the back-projection image after the motion-template information is added, and M(i,j,k) is the motion-template information.
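Equation (10) amounts to masking the back-projection image with the binary motion template. A minimal sketch, assuming both images are lists of rows of equal size and using a hypothetical function name:

```python
def mask_back_projection(p, m):
    """Equation (10): keep P where M == 1, set the value to 0 where M == 0."""
    return [
        [pv if mv == 1 else 0 for pv, mv in zip(p_row, m_row)]
        for p_row, m_row in zip(p, m)
    ]
```

Background regions that happen to share the target's color and texture are zeroed out as long as they are not moving.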

Step 4: perform an occlusion test on the target. If the target is judged not to be occluded, a Kalman filter predicts the target's position in the next frame, and the prediction is used as the starting point of the CamShift iteration in the next frame. If the target is judged to be occluded, the Kalman filter stops, the trajectory is predicted with a least-squares method, and a CamShift search is run at each predicted position.

When the target is occluded, it can be regarded as deformed. If CamShift continues to compute the target centroid at this point, the result is not the true position, and an observation formed from that point is not a correct observation for the Kalman filter in the current frame. This embodiment therefore judges whether the target is occluded by the residual between the target's observed value and its optimal estimate:

r(k) = \sqrt{(x(k) - \hat{x}(k))^{2} + (y(k) - \hat{y}(k))^{2}} \quad (13)

where \hat{x}(k) and \hat{y}(k) are the estimated values of the target, and x(k) and y(k) are the observed values. A threshold α is chosen: when r(k) > α the target is occluded, and when r(k) < α it is not.
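The residual test of equation (13) can be sketched in a few lines. The function names and the tuple representation of points are assumptions; the threshold α must be chosen by the user, and the text gives no value for it.

```python
import math


def residual(observed, estimated):
    """Equation (13): Euclidean distance between observation and estimate."""
    return math.hypot(observed[0] - estimated[0], observed[1] - estimated[1])


def is_occluded(observed, estimated, alpha):
    """The target is judged occluded when r(k) exceeds the threshold alpha."""
    return residual(observed, estimated) > alpha
```

For example, with an observation at (3, 4) and an estimate at (0, 0) the residual is 5, so the target counts as occluded for a threshold of 4 but not for a threshold of 6.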

When the target is not occluded, a Kalman filter predicts the target's position in the next frame, and the prediction is used as the starting point of the CamShift iteration in the next frame. The traditional CamShift algorithm uses the target centroid of the current frame directly as the starting point of the next frame's iteration and makes no prediction of target motion, so the tracking algorithm can fail. The joint color-texture feature combined with motion information excludes background interference, but an object in the foreground with features similar to the target can still disturb the tracking result. To address these problems, a Kalman filter is introduced to estimate, from the current frame, the target's position in the next frame, which is used as the starting point of the next frame's CamShift iteration.

During tracking, the interval between adjacent frames is short, so this embodiment treats the target as moving at constant velocity over that interval and adopts a constant-velocity motion model.

Let the state vector of the target be X(k) = [x(k), y(k), v_x(k), v_y(k)]^T and, consistent with the observation matrix H(k) below, let the observation vector be Z(k) = [x(k), y(k)]^T, where x(k), y(k), v_x(k), and v_y(k) are the coordinates and velocities of the target centroid along the x and y axes, with

v_{x}(k) = \frac{x(k) - x(k-1)}{\Delta t},

v_{y}(k) = \frac{y(k) - y(k-1)}{\Delta t}.

The state-transition matrix Φ(k) of the system is therefore:

\Phi(k) = \begin{bmatrix} 1 & 0 & \Delta t & 0 \\ 0 & 1 & 0 & \Delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (11)

The observation matrix H(k) is:

H(k) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \quad (12)

Here Δt is the time interval between t(k) and t(k-1); expressed as a frame-number difference, Δt = 1. The process-noise covariance matrix Q(k) and the observation-noise covariance matrix R(k) can be set to identity matrices.

In the experiment, the detected centroid and window of the moving target serve as the initial input to the CamShift algorithm, and the state vector X(k) is initialized at the same time: x(k) and y(k) are set to the detected centroid position, and v_x(k) and v_y(k) are both set to 0. During tracking, at time k, the prediction \hat{X}(k|k-1) of X(k) is obtained from the optimal estimate \hat{X}(k-1|k-1) of the target centroid position at the previous time step. CamShift computes the target centroid, which serves as the observation Z(k) to correct the prediction \hat{X}(k|k-1) and obtain the optimal estimate \hat{X}(k|k) of X(k). The centroid position in the next-frame prediction \hat{X}(k+1|k) is then used as the input to CamShift in the next frame. In this way Kalman filtering estimates the target centroid position in every frame and improves the tracking result.
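The predict and correct cycle described above can be sketched with the standard Kalman filter equations in plain Python. The sketch uses Φ and H from equations (11) and (12) with Δt = 1 and Q = R = I as stated; it treats the CamShift centroid as the observation, which is what H(k) selects. The identity initial covariance and all helper names are assumptions for illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inv2(m):
    # Inverse of a 2x2 matrix.
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

DT = 1.0  # frame-number difference
PHI = [[1, 0, DT, 0], [0, 1, 0, DT], [0, 0, 1, 0], [0, 0, 0, 1]]  # eq. (11)
H = [[1, 0, 0, 0], [0, 1, 0, 0]]                                  # eq. (12)
Q, R = eye(4), eye(2)  # identity noise covariances, as in the text

def predict(x, p):
    """Time update: state and covariance predicted for the next frame."""
    return matmul(PHI, x), add(matmul(matmul(PHI, p), transpose(PHI)), Q)

def correct(x_pred, p_pred, z):
    """Measurement update with z = observed centroid [[x], [y]]."""
    s = add(matmul(matmul(H, p_pred), transpose(H)), R)
    k = matmul(matmul(p_pred, transpose(H)), inv2(s))  # Kalman gain
    x_est = add(x_pred, matmul(k, sub(z, matmul(H, x_pred))))
    p_est = matmul(sub(eye(4), matmul(k, H)), p_pred)
    return x_est, p_est
```

In a tracking loop, the first two entries of the predicted state would seed the CamShift search window, and the centroid returned by CamShift would be passed to `correct` as `z`.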

If the target is judged to be occluded, the Kalman filter stops, the trajectory is predicted with a least-squares method, and a CamShift search is run at each predicted position.

When occlusion occurs, the Kalman filter stops and tracking switches to trajectory prediction by least-squares estimation, while the neighborhood of each predicted position is searched for the reappearing target. If the decision criterion finds that the target is still occluded, trajectory prediction continues; if the target is no longer occluded, CamShift and Kalman filter tracking resume, and the Kalman filter state is updated with the new observation.

For a time series {x_i}, i = 1, 2, ..., N, let x_N be the value to be predicted and {x_{N-1}, x_{N-2}, x_{N-3}, ..., x_{N-m}} the known related quantities. A straight line is fitted by least squares, that is, the line that minimizes the sum of squared errors over {x_{N-1}, x_{N-2}, x_{N-3}, ..., x_{N-m}}. Applying the least-squares fit to the target positions {x_{N-1}, x_{N-2}, x_{N-3}, ..., x_{N-m}} of the m frames before the occlusion gives a linear regression model of the target position.

From this model the prediction for the first future step is obtained, then the prediction for the second future step, and so on, until the target position has been predicted through the end of the occlusion.
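The least-squares extrapolation can be sketched as an ordinary straight-line fit over the last m positions. The parameterization x(t) = a + b·t with t = 0, ..., m-1 is an assumption made for this sketch, since the exact form of the regression model is not reproduced in the text; the function names are hypothetical, and the fit would be applied to the x and y coordinates separately.

```python
def fit_line(values):
    """Least-squares line x(t) = a + b*t through values at t = 0..m-1 (m >= 2)."""
    m = len(values)
    t_mean = (m - 1) / 2.0
    x_mean = sum(values) / m
    sxx = sum((t - t_mean) ** 2 for t in range(m))
    sxy = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(values))
    b = sxy / sxx
    a = x_mean - b * t_mean
    return a, b


def extrapolate(values, steps):
    """Predict the next `steps` values by extending the fitted line."""
    a, b = fit_line(values)
    m = len(values)
    return [a + b * (m + s) for s in range(steps)]
```

Each predicted position would then serve as the center of a CamShift search for the reappearing target.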

For target tracking in video images in this embodiment, the CamShift algorithm proceeds as follows:

(1) Convert the video image from the RGB color space to the HSV space.

(2) Define the target's initial region in the first tracking frame and compute the histogram of the H component in this region.

(3) Compute the probability distribution image of this histogram in each frame, that is, the degree to which the distribution matches, to obtain the back-projection image of each frame, and scale all pixel values of the back-projection image to the range [0, 255].

(4) Run the CamShift search on the back-projection image of each frame to find the region that best matches the initial target histogram distribution; this region is the tracked target position.

(5) Use the target position found in the current frame as the starting point of the CamShift search in the next frame.

Step 5: determine whether tracking has ended. If tracking has not ended, obtain the current frame of the video and return to step 2; if tracking has ended, the method stops.

Further, the method of the present invention can be implemented in a simulation system based on Visual Studio 2008 and OpenCV 2.0.

The preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative work. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of the present invention shall fall within the scope of protection determined by the claims.

Claims

1. A method for tracking a moving target under complex background and occlusion conditions, used to track a moving target in a video image, characterized in that it comprises the steps of:

(1) determining the target and the initial frame of the video image;

(2) obtaining the back projection of the two-dimensional histogram of the color and texture information of the target, wherein the color is the chromaticity information of the image of the target, and the histogram of the color is used to obtain a back projection in the video image to represent the color of the target; the texture is a grayscale image obtained from the image of the target by computing a gray-level co-occurrence matrix, the texture is extracted using texture features in the gray-level co-occurrence matrix, and the histogram of the texture of the target in the grayscale image is used to obtain a back projection in the video image to represent the texture of the frame image;

(3) combining the two-dimensional color-texture histogram of the target with the motion template information of the target to obtain an extended back projection, wherein the motion template information is the foreground image obtained by taking the difference between the video image and a background model image;

(4) performing an occlusion determination on the target; if the target is determined to be not occluded, using a Kalman filter to predict the position of the target in the next frame, taking the prediction result as the starting point of the CamShift iteration in the next frame, and iterating on the next frame with the CamShift algorithm; if the target is determined to be occluded, stopping the Kalman filter, performing trajectory prediction with a method based on least-squares estimation, and searching at each predicted position with the CamShift algorithm;

(5) determining whether tracking has ended; if tracking has not ended, obtaining the current frame of the video and going to step (2); if tracking has ended, the method stops.

2. The method for tracking a moving target under complex background and occlusion conditions according to claim 1, characterized in that the color in step (2) uses the HSV color model to separate the hue, saturation, and brightness of the color, wherein the H component represents hue, the S component represents saturation, and the V component represents brightness.

3. The method for tracking a moving target under complex background and occlusion conditions according to claim 2, characterized in that the histogram of the H component of the HSV color model is used to obtain a back projection in the video image to represent the color of the target.

4. The method for tracking a moving target under complex background and occlusion conditions according to claim 1, characterized in that the texture features in the gray-level co-occurrence matrix used for extracting the texture in step (2) are generated from the gray-level differences between the 8 neighbors of each pixel of the grayscale image in the up, down, left, right, and diagonal directions; for each pixel, the neighborhood gray-level differences in the 45°, 135°, 90°, and 0° directions are taken, namely:

G1(x,y) = G(x+1,y+1) - G(x-1,y-1)

G2(x,y) = G(x-1,y+1) - G(x+1,y-1)

G3(x,y) = G(x,y+1) - G(x,y-1)

G4(x,y) = G(x+1,y) - G(x-1,y)

where (x, y) are the pixel coordinates. After the neighborhood gray-level differences in the four directions are obtained, the gray-level difference of each pixel of the texture is defined as

G5(x,y) = [G1(x,y) + G2(x,y) + G3(x,y) + G4(x,y)] / 4

The texture is thus obtained.

5. The method for tracking a moving target under complex background and occlusion conditions according to claim 1, characterized in that in step (3) the two-dimensional color-texture histogram of the target is combined with the motion template information of the target to obtain the extended back projection:

\hat{P}(i,j,k) = \begin{cases} P(i,j,k), & M(i,j,k) = 1 \\ 0, & M(i,j,k) = 0 \end{cases}

where P(i,j,k) is the back projection of the two-dimensional color-texture histogram, \hat{P}(i,j,k) is the extended back projection, and M(i,j,k) is the motion template information.

6. The method for tracking a moving target under complex background and occlusion conditions according to claim 1, characterized in that the occlusion determination in step (4) judges whether the target is occluded by comparing the residual between the observed value and the optimal estimate of the target:

r(k) = \sqrt{(x(k) - \hat{x}(k))^{2} + (y(k) - \hat{y}(k))^{2}}

where x and y are the coordinates of the centroid of the target along the x-axis and y-axis, k denotes the k-th frame, \hat{x}(k) and \hat{y}(k) are the estimated values of the target, and x(k) and y(k) are the observed values of the target. A threshold α is defined: when r(k) > α, the target is determined to be occluded; when r(k) < α, the target is determined to be not occluded.