Background art
Moving-target tracking determines the position of a moving target of interest in each frame of a video sequence and associates the same target across different frames. Detecting targets against complex backgrounds with optical sensors is important in both civilian and military fields, and the related sensor design, information processing and system simulation methods have long been a focus of researchers and engineers.
When a ground platform (for example, portable or fixed) surveys its surroundings, or when an airborne platform (such as an aircraft, airship or satellite) surveys the ground, an optical detection system is often used to acquire images of the target and background, either for target detection, recognition and tracking or for acquiring and perceiving terrain and environment information.
With the development of charge-coupled devices (CCD), CMOS imaging devices and digital image processing, visible-light imaging has been widely adopted in many fields. A high-definition visible-light video alarm system with intelligent image processing can automatically identify different objects, detect abnormal conditions in the monitored scene, raise alarms and provide useful information. Such systems can be applied to counter-terrorism, emergency response, aviation supervision, traffic management, customer behavior analysis and similar settings.
In the middle and late 20th century, visible-light cameras developed from black-and-white to color. Over the past 20 years or so, color cameras have been used in large numbers in video surveillance systems in the security field, and owing to the advantages of the CCD and the performance it now achieves, color CCD cameras have become the mainstream in video surveillance. Apart from military applications, video surveillance based on visible-light imaging is developing toward high definition, networking, digitization and intelligent monitoring. A new generation of CCD monitoring and alarm systems with intelligent image processing offers significant advantages in several respects: reliable automatic monitoring around the clock; higher alarm accuracy, with fewer false alarms, fewer missed alarms and less useless information; fast image processing algorithms that improve recognition response speed; and more effective use and wider application of video resources.
The traditional Mean Shift and CamShift algorithms achieve good results with good real-time performance against simple backgrounds, and are therefore widely used for video tracking of moving targets. Mean Shift tracking uses the probability distribution of pixel values in the target region as its feature; because its search converges quickly, the algorithm has good real-time performance and a certain robustness. The kernel window plays a very important role in the Mean Shift algorithm. The kernel window is usually determined by the initial tracking window, and its size is fixed. During tracking, however, if the target scale changes, and especially if the target grows beyond the kernel window, tracking easily fails. To address this defect, G. Bradski proposed the CamShift algorithm as an improvement on Mean Shift. Its main difference from Mean Shift is that it adaptively adjusts the kernel window size during tracking to follow changes in target scale. Nevertheless, both Mean Shift and CamShift essentially rely on the color probability distribution of the target, that is, a histogram feature, and a histogram is a weak feature; tracking therefore fails easily when the background is complex or contains large areas of interfering color. Moreover, neither algorithm makes any prediction of the target trajectory, so they often cannot continue to track effectively when the target maneuvers rapidly or is lost through occlusion.
In summary, current video tracking technology still faces several difficulties in moving-target tracking, arising mainly from the following: the effect of illumination intensity and weather changes on tracking; complex changes in target appearance such as rotation, scaling and displacement; tracking stability when the target moves rapidly; occlusion of the moving target; background interference and other complicating factors; correct detection and segmentation of the moving target; image data fusion, for example in multi-camera tracking; and real-time performance.
Although video target tracking has been studied extensively over the past decade or more and has made significant progress, many conditions in real environments impair reliable observation of the target in video images. Designing a method that tracks video targets accurately, quickly and stably in a variety of complex environments therefore remains a challenging task and an urgent problem, in particular a moving-target tracking method applicable to complex backgrounds and occlusion.
Accordingly, those skilled in the art are working to develop a method for tracking moving targets against complex backgrounds and under occlusion.
Summary of the invention
To track moving targets against complex backgrounds and under occlusion, the present invention proposes a moving-target tracking method for complex backgrounds and occlusion conditions that incorporates a Kalman filter. The invention uses a color-texture two-dimensional histogram as the target feature, which suppresses the effect of illumination well. On the basis of the target back projection image, it incorporates the motion mask information of the scene to obtain an improved back projection image that effectively removes background interference. It uses a Kalman filter to predict the motion of the target, which increases tracking robustness and improves tracking accuracy when the target moves rapidly. When the target is lost through occlusion, it predicts the target motion state by fitting the target trajectory by least squares and extrapolating from the target's prior velocity; once the target reappears, it is detected and tracking resumes, which effectively addresses the occlusion problem.
To achieve the above object, the present invention improves CamShift with color, texture and motion information and combines it with Kalman filtering, thereby realizing a moving-target tracking method for complex backgrounds and occlusion conditions. A simulation system for the method was implemented with Visual Studio 2008 and OpenCV 2.0.
The method of the present invention comprises the following steps:
(1) determining the target and the initial frame of the video image;
(2) obtaining the back projection of the two-dimensional histogram of the color and texture information of the target, wherein the back projection measures how well each part of the whole image matches the histogram distribution of the target region, regions more similar to the target feature receiving larger weights, and the value of each point is then scaled to the range 0-255 to obtain a new gray-level image, namely the back projection image; the color is the chrominance information of the image of the target, and the histogram of the color is used to obtain, in the video image, a back projection image representing the color of the target; because color information is easily affected by illumination conditions, the method combines it with texture information to cope with changing illumination; the texture is a gray-level image obtained from the image of the target by computing a gray-level co-occurrence matrix, the texture being extracted using the texture feature of the gray-level co-occurrence matrix, and the histogram of the texture of the target in that gray-level image is used to obtain, in the video image, a back projection image representing the texture of the frame;
(3) combining the color-texture two-dimensional histogram of the target with the motion mask information of the target to obtain an extended back projection image, wherein the motion mask information is the foreground image obtained as the difference between the video image and a background model image;
(4) performing an occlusion judgment on the target; if the target is judged not to be occluded, using a Kalman filter to predict the position of the target in the next frame, the prediction serving as the starting point of the CamShift iteration in the next frame; if the target is judged to be occluded, suspending the Kalman filter, predicting the trajectory by a method based on least-squares estimation, and searching with the CamShift algorithm at each predicted position;
(5) judging whether tracking has finished; if tracking has not finished, obtaining the current frame of the video and returning to step (2); if tracking has finished, terminating the method.
Wherein, the color uses the HSV color model, which separates hue, saturation and brightness: the H component represents hue, the S component represents saturation, and the V component represents brightness.
Further, the histogram of the H component and its back projection image are used to characterize the color of the target.
Wherein, the texture is extracted by Gabor filtering.
Further, the texture feature of the gray-level co-occurrence matrix used to extract the texture is produced from the gray-level differences within the 8-neighborhood of each pixel: for each pixel of the gray-level image, the neighborhood gray-level differences in the 45°, 135°, 90° and 0° directions are taken, namely:
G1(x,y)=G(x+1,y+1)-G(x-1,y-1)                 (1)
G2(x,y)=G(x-1,y+1)-G(x+1,y-1)                 (2)
G3(x,y)=G(x,y+1)-G(x,y-1)                     (3)
G4(x,y)=G(x+1,y)-G(x-1,y)                     (4)
where (x, y) is the pixel coordinate. After the neighborhood gray-level differences in the four directions have been obtained, the value of each pixel of the texture is defined as
G5(x,y)=[G1(x,y)+G2(x,y)+G3(x,y)+G4(x,y)]/4   (5)
The texture is thereby obtained.
Wherein, the color-texture two-dimensional histogram is obtained by extending the histogram computation for a one-dimensional, single-plane image to the two-dimensional case of the composite plane formed by the color and texture images; the back projection of the color-texture two-dimensional histogram of the target is obtained by computing how well the distribution of that histogram matches the two-plane image formed by the color and the texture.
Wherein, the color-texture two-dimensional histogram of the target is combined with the motion mask information of the target to obtain the extended back projection image, the motion mask information being the foreground image obtained in each video frame after differencing against the background model.
Wherein, the extended back projection image obtained by combining the color-texture two-dimensional histogram of the target with the motion mask information of the target is defined as follows:
$$\hat{P}(i,j,k)=\begin{cases}P(i,j,k), & M(i,j,k)=1\\ 0, & M(i,j,k)=0\end{cases}$$
where P(i, j, k) is the back projection image of the color-texture two-dimensional histogram, $\hat{P}(i,j,k)$ is the extended back projection image, and M(i, j, k) is the motion mask information.
Wherein, the occlusion judgment decides whether the target is occluded by comparing the residual between the observed value and the optimal estimate of the target:
$$r(k)=\sqrt{\left(x(k)-\hat{x}(k)\right)^{2}+\left(y(k)-\hat{y}(k)\right)^{2}}$$
In the formula, x and y are the coordinates of the target centroid along the x axis and the y axis, k denotes the k-th frame, $\hat{x}(k)$ and $\hat{y}(k)$ are the estimated values of the target, and x(k) and y(k) are the observed values of the target. A threshold α is defined: when r(k) > α the target is judged to be occluded, and when r(k) < α the target is judged not to be occluded.
Wherein, the Kalman filter applies a uniform-motion (constant-velocity) model to the target. The state vector of the target is defined as X(k) = [x(k), y(k), vx(k), vy(k)]^T, where x(k), y(k), vx(k), vy(k) are respectively the coordinates and the velocities of the target centroid along the x axis and the y axis. The observation is the centroid position computed by the CamShift algorithm, so the observation vector is Z(k) = [x(k), y(k)]^T. The state transition matrix Φ(k) is:
$$\Phi(k)=\begin{bmatrix}1&0&\Delta t&0\\0&1&0&\Delta t\\0&0&1&0\\0&0&0&1\end{bmatrix}$$
The observation matrix H(k) is:
$$H(k)=\begin{bmatrix}1&0&0&0\\0&1&0&0\end{bmatrix}$$
where Δt is the time interval between t(k) and t(k-1), which can be expressed as a difference in frame number, so that Δt = 1. The dynamic noise covariance matrix Q(k) and the observation noise covariance matrix R(k) are taken to be identity matrices. The centroid and window of the detected target are used as the initial input of the CamShift algorithm, and the state vector X(k) is initialized at the same time: x(k) and y(k) are initialized to the centroid position of the detected target, and vx(k) and vy(k) are initialized to 0. During tracking, at frame k, the predicted value $\hat{X}(k|k-1)$ of X(k) is obtained from the optimal estimate $\hat{X}(k-1|k-1)$ of the previous frame. The CamShift algorithm computes the centroid of the target, which is used as the observation Z(k) to correct the predicted value $\hat{X}(k|k-1)$ and obtain the optimal estimate $\hat{X}(k|k)$ of X(k). The predicted value $\hat{X}(k+1|k)$ for the next frame is then computed, and the position information in it is used as the starting point of the CamShift search for the target centroid in the next frame.
The method of the present invention can be implemented as a simulation system based on Visual Studio 2008 and OpenCV 2.0.
The present invention improves the CamShift algorithm on the basis of the combined features of color, texture and target motion information, and uses a Kalman filter to predict the target motion state, improving the stability and accuracy of tracking moving targets against complex backgrounds. Color information is easily disturbed by factors such as illumination and similarly colored background; the target texture feature is introduced to remedy the shortcomings of using color alone, and target motion information is further added to exclude interference in the background. When the target is occluded, least-squares fitting and trajectory extrapolation from the prior information available before the occlusion predict the target position, which helps recapture the moving target when the occlusion ends. Through the complementary information of multiple features and the real-time prediction of the target, the method achieves continuous, stable tracking of the target against complex backgrounds and under short-term occlusion.
The concept, specific structure and technical effects of the present invention are further described below with reference to the accompanying drawing, so that its objects, features and effects can be fully understood.
Embodiment
As shown in Figure 1, in the present embodiment the method of the present invention comprises the following steps:
First step: determine the tracking target and the initial frame of the corresponding video.
Second step: obtain the back projection of the two-dimensional histogram of the color and texture information of the target. The back projection measures how well each part of the whole image matches the histogram distribution of the target region; regions more similar to the target feature receive larger weights, and the value of each point is then scaled to the range 0-255 to obtain a new gray-level image, namely the back projection image. The color is the chrominance information of the image of the target, and the histogram of the color is used to obtain, in the video image, a back projection image representing the color of the target. Because color information is easily affected by illumination conditions, the method combines it with texture information to cope with changing illumination. The texture is a gray-level image obtained from the image of the target by computing a gray-level co-occurrence matrix; the texture is extracted using the texture feature of the gray-level co-occurrence matrix, and the histogram of the texture of the target in that gray-level image is used to obtain, in the video image, a back projection image representing the texture of the frame.
In the present embodiment, the color information uses the HSV color model, which separates hue, saturation and brightness: the H component represents hue, the S component represents saturation, and the V component represents brightness. More preferably, because the H component contains the color information of the object, the histogram of the H component and its back projection suffice to characterize the color of the target.
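As an illustration of the H-component histogram described above, the following is a minimal sketch in plain Python. It is not the implementation of the embodiment (which is built on OpenCV); the function name hue_histogram, the bin count of 16 and the sample pixels are assumptions made for the example.

```python
import colorsys

def hue_histogram(pixels, bins=16):
    """Normalised histogram of the H component for a list of (r, g, b) pixels.

    The bin count is an illustrative assumption, not a value from the text.
    """
    hist = [0] * bins
    for r, g, b in pixels:
        # colorsys expects channels in [0, 1] and returns h in [0, 1].
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        index = min(int(h * bins), bins - 1)
        hist[index] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist

# Hypothetical target region: two reddish pixels and one blue pixel.
target_pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]
hist = hue_histogram(target_pixels)
```

Because only the hue is binned, the two reddish pixels fall into the same bin even though their brightness differs slightly, which is the property the embodiment relies on.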
In the present embodiment, texture is a very important feature in image analysis. Unlike image features such as color and edges, it has good properties such as strong resistance to abrupt illumination changes and local ordering. Weighing performance against computational complexity, the present embodiment uses a gray-level co-occurrence matrix texture feature produced from the gray-level differences within the 8-neighborhood of each pixel. For each pixel of the gray-level image, the neighborhood gray-level differences in the 45°, 135°, 90° and 0° directions are taken, namely:
G1(x,y)=G(x+1,y+1)-G(x-1,y-1)                  (1)
G2(x,y)=G(x-1,y+1)-G(x+1,y-1)                  (2)
G3(x,y)=G(x,y+1)-G(x,y-1)                      (3)
G4(x,y)=G(x+1,y)-G(x-1,y)                      (4)
After the neighborhood gray-level differences in the four directions have been obtained, the value of each pixel of the texture image is defined as
G5(x,y)=[G1(x,y)+G2(x,y)+G3(x,y)+G4(x,y)]/4    (5)
The texture image is thereby obtained; the gray-level histogram of the target in this image and its back projection are computed as the texture information of the target.
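Formulas (1) to (5) can be sketched directly in plain Python as follows. The function name texture_image and the choice to leave border pixels at zero are assumptions of this example; the text does not say how image borders are handled.

```python
def texture_image(gray):
    """Compute G5 of formula (5) for every interior pixel.

    gray is a list of rows, indexed gray[y][x]. Border pixels, which lack a
    full 8-neighborhood, are left at 0.0 (an assumption of this sketch).
    """
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            g1 = gray[y + 1][x + 1] - gray[y - 1][x - 1]  # formula (1)
            g2 = gray[y + 1][x - 1] - gray[y - 1][x + 1]  # formula (2)
            g3 = gray[y + 1][x] - gray[y - 1][x]          # formula (3)
            g4 = gray[y][x + 1] - gray[y][x - 1]          # formula (4)
            out[y][x] = (g1 + g2 + g3 + g4) / 4.0         # formula (5)
    return out

# Tiny 3x3 example with a bright bottom row.
sample = [[0, 0, 0],
          [0, 0, 0],
          [4, 8, 12]]
texture = texture_image(sample)
```

The differences can be negative, so before a histogram is taken the values would need to be mapped to a non-negative range; the text does not specify that mapping, and it is left out here.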
Third step: the color-texture two-dimensional histogram of the target is combined with the motion mask information of the target to obtain the extended back projection image; the motion mask information is the foreground image obtained in each video frame after differencing against the background model image.
In the present embodiment, the color and texture information of the target are fused: the separate one-dimensional histograms of color and texture are extended and combined into a two-dimensional histogram, and the back projection image is computed over the two-channel image formed by the H component and the texture.
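The two-dimensional histogram and its back projection can be sketched as follows. This is a plain-Python illustration under stated assumptions: both channels are taken to be already quantised to integers in 0..255, the bin count of 8 per axis is arbitrary, and the function names are hypothetical.

```python
def joint_histogram(hue, tex, region, bins=8, levels=256):
    """2-D histogram over (hue, texture) for region = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = region
    hist = [[0] * bins for _ in range(bins)]
    for y in range(y0, y1):
        for x in range(x0, x1):
            hist[hue[y][x] * bins // levels][tex[y][x] * bins // levels] += 1
    return hist

def back_project(hue, tex, hist, bins=8, levels=256):
    """Replace each pixel by the weight of its (hue, texture) bin, scaled to 0..255."""
    peak = max(max(row) for row in hist) or 1  # avoid dividing by zero
    return [[hist[h * bins // levels][t * bins // levels] * 255 // peak
             for h, t in zip(hue_row, tex_row)]
            for hue_row, tex_row in zip(hue, tex)]

# Hypothetical 2x2 frame: the left column is the target region.
hue = [[10, 200], [10, 200]]
tex = [[30, 30], [30, 220]]
target_hist = joint_histogram(hue, tex, region=(0, 0, 1, 2))
projection = back_project(hue, tex, target_hist)
```

A pixel scores highly only when both its hue and its texture fall in bins that are common in the target region, which is what makes the joint feature more selective than either one-dimensional histogram.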
Extracting the moving target in each frame effectively excludes interference in the background, narrows the search range and improves tracking accuracy. First, the foreground target in the scene, namely the motion mask, is computed. To remove noise in the image and obtain a more faithful background, the video images over a certain length of time are averaged with weights to obtain an initial background that approximates the real background image. When the image sequence passes through this temporal low-pass filter, the slowly varying part of the sequence is separated from the rapidly changing part. The image background model is built as follows:
B(i,j,k+1)=B(i,j,k)+g(k)·(I(i,j,k)-B(i,j,k))          (6)
g(k)=β·(1-M(i,j,k))+α·M(i,j,k)                     (7)
In the formulas, I(i, j, k) is the pixel value at coordinate (i, j) of frame k, B(i, j, k) is the background image of the current frame, B(i, j, k+1) is the background image of the next frame, β is the background factor and α is the motion factor, with α and β both between 0 and 1. These two factors govern the adaptation of the background update: by formula (7), pixels judged to be non-moving (M = 0) are updated with gain β, and pixels judged to be moving (M = 1) are updated with gain α. avg1 and σ1 are the mean and variance of the difference image between the current image and the background image, avg2 and σ2 are the mean and variance of the current image, M(i, j, k) is the binarized foreground image, and Th is the binarization threshold. Formula (6) can be understood as a recursive prediction of the background: it computes the background at the next moment through real-time updating with the background factor and the motion factor. The detected foreground target is taken as the motion mask information of the scene, that is, the value of M(i, j, k) indicates whether a given pixel belongs to the moving target.
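The recursive update of formulas (6) and (7) can be sketched in plain Python as below. The values of alpha and beta and the fixed threshold in motion_mask are assumptions made for illustration; in particular, the text derives Th from image statistics by a rule that is not reproduced here, so a constant stands in for it.

```python
def motion_mask(background, frame, threshold):
    """Binarised foreground M: 1 where the frame differs strongly from the background.

    A fixed threshold is an assumption of this sketch, standing in for Th.
    """
    return [[1 if abs(i - b) > threshold else 0 for b, i in zip(b_row, i_row)]
            for b_row, i_row in zip(background, frame)]

def update_background(background, frame, mask, alpha=0.01, beta=0.1):
    """Apply formulas (6) and (7) pixel by pixel and return B(k+1)."""
    new_background = []
    for b_row, i_row, m_row in zip(background, frame, mask):
        row = []
        for b, i, m in zip(b_row, i_row, m_row):
            gain = beta * (1 - m) + alpha * m   # formula (7)
            row.append(b + gain * (i - b))      # formula (6)
        new_background.append(row)
    return new_background

# Hypothetical one-row image: the second pixel changes sharply (moving object).
background = [[100.0, 100.0]]
frame = [[120, 200]]
mask = motion_mask(background, frame, threshold=30)
next_background = update_background(background, frame, mask)
```

With a small alpha, pixels covered by the moving object barely alter the background, while the larger beta lets static pixels follow gradual lighting changes.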
Combined with the motion mask of the scene, the final back projection image can be computed. It is defined as follows:
$$\hat{P}(i,j,k)=\begin{cases}P(i,j,k), & M(i,j,k)=1\\ 0, & M(i,j,k)=0\end{cases}$$
where P(i, j, k) is the back projection image of the color-texture feature of the target, $\hat{P}(i,j,k)$ is the back projection image after the motion mask information has been added, and M(i, j, k) is the motion mask information.
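A minimal sketch of this combination follows, assuming that the binary motion mask simply gates the back projection so that pixels outside the mask are set to zero. That gating rule and the function name are assumptions of the example.

```python
def gate_back_projection(back_projection, mask):
    """Keep back projection values where the mask is 1, zero them elsewhere."""
    return [[p if m else 0 for p, m in zip(p_row, m_row)]
            for p_row, m_row in zip(back_projection, mask)]

# Hypothetical values: a static background patch (top right) resembles the target.
back_projection = [[255, 200], [0, 180]]
mask = [[1, 0], [0, 1]]
gated = gate_back_projection(back_projection, mask)
```

The static look-alike at the top right is suppressed, so a later window search is not drawn toward it.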
Fourth step: an occlusion judgment is made on the target. If the target is judged not to be occluded, the Kalman filter predicts the position of the target in the next frame, and the prediction serves as the starting point of the CamShift iteration in the next frame. If the target is judged to be occluded, the Kalman filter is suspended, the trajectory is predicted by a method based on least-squares estimation, and the CamShift algorithm searches at each predicted position.
When the target is occluded, it can be regarded as having undergone deformation. If the CamShift algorithm continues to be used at this point, the centroid position it computes is not the true position of the target, and the observation that this position supplies to the Kalman filter for the current frame is likewise not a correct observation. The present embodiment therefore judges whether the target is occluded by comparing the residual between the observed value and the optimal estimate of the target:
$$r(k)=\sqrt{\left(x(k)-\hat{x}(k)\right)^{2}+\left(y(k)-\hat{y}(k)\right)^{2}}$$
In the formula, $\hat{x}(k)$ and $\hat{y}(k)$ are the estimated values of the target, and x(k) and y(k) are the observed values of the target. A threshold α is selected: when r(k) > α the target is occluded, and when r(k) < α the target is not occluded.
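The occlusion test can be sketched as follows, assuming the residual is the Euclidean distance between the observed and the estimated centroid. The function name and the example threshold values are hypothetical.

```python
import math

def is_occluded(observed, estimated, threshold):
    """Return True when the centroid residual r(k) exceeds the threshold."""
    residual = math.hypot(observed[0] - estimated[0],
                          observed[1] - estimated[1])
    return residual > threshold

# Observed centroid (10, 10) against an estimate of (13, 14): residual is 5.
occluded_tight = is_occluded((10, 10), (13, 14), threshold=4.0)
occluded_loose = is_occluded((10, 10), (13, 14), threshold=6.0)
```

The threshold trades sensitivity against false alarms: a small value reacts quickly to occlusion but may also fire on ordinary measurement jitter.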
When the target is not occluded, the Kalman filter predicts the position of the target in the next frame, and the prediction serves as the starting point of the CamShift iteration in the next frame. The traditional CamShift algorithm directly uses the target centroid position of the current frame as the starting point of the next-frame iteration and makes no prediction of target motion, so tracking can fail, for example when the target moves rapidly. Moreover, although the joint color-texture feature combined with motion information can exclude interference in the background, an object in the foreground with features close to those of the target can still disturb the tracking result. To address these problems, a Kalman filter is introduced to estimate the position of the current target in the next frame, and this estimate is used as the starting point of the CamShift iteration in the next frame.
In target tracking, because the interval between two adjacent frames is short, the target can be regarded as moving uniformly over that interval, so the present embodiment adopts a uniform-motion (constant-velocity) model.
Let the state vector of the target be X(k) = [x(k), y(k), vx(k), vy(k)]^T, where x(k), y(k), vx(k), vy(k) are respectively the coordinates and the velocities of the target centroid along the x axis and the y axis. The observation is the centroid position computed by the CamShift algorithm, so the observation vector is Z(k) = [x(k), y(k)]^T. The state transition matrix Φ(k) of the system is therefore:
$$\Phi(k)=\begin{bmatrix}1&0&\Delta t&0\\0&1&0&\Delta t\\0&0&1&0\\0&0&0&1\end{bmatrix}$$
The observation matrix H(k) is:
$$H(k)=\begin{bmatrix}1&0&0&0\\0&1&0&0\end{bmatrix}$$
where Δt is the time interval between t(k) and t(k-1), which can be expressed as a difference in frame number, in which case Δt = 1. The dynamic noise covariance matrix Q(k) and the observation noise covariance matrix R(k) can be taken to be identity matrices.
In the experiment, the centroid and window of the detected moving target are used as the initial input of the CamShift algorithm, and the state vector X(k) is initialized at the same time: x(k) and y(k) are initialized to the centroid position of the detected target, and vx(k) and vy(k) are both initialized to 0. During tracking, at time k, the predicted value $\hat{X}(k|k-1)$ of X(k) is obtained from the optimal estimate $\hat{X}(k-1|k-1)$ of the previous time step. The CamShift algorithm computes the centroid of the target, which is used as the observation Z(k) to correct the predicted value $\hat{X}(k|k-1)$ and obtain the optimal estimate $\hat{X}(k|k)$ of X(k). The centroid position in the next-frame predicted value $\hat{X}(k+1|k)$ is then used as the input of the CamShift algorithm in the next frame. In this way Kalman filtering estimates the target centroid position in every frame and improves the tracking result.
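The predict and correct cycle can be sketched in plain Python as below. The sketch assumes a constant-velocity model with Δt = 1, a position-only observation (the CamShift centroid), identity Q and R, and an identity initial error covariance; the last of these, the helper names and the synthetic measurements are assumptions of the example. The standard Kalman filter equations are used.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

DT = 1.0  # one frame between updates
PHI = [[1, 0, DT, 0], [0, 1, 0, DT], [0, 0, 1, 0], [0, 0, 0, 1]]
H = [[1, 0, 0, 0], [0, 1, 0, 0]]  # assumption: only the centroid position is observed
Q = identity(4)
R = identity(2)

def predict(x, p):
    """Time update: propagate the state and its error covariance by one frame."""
    return matmul(PHI, x), add(matmul(matmul(PHI, p), transpose(PHI)), Q)

def correct(x_pred, p_pred, z):
    """Measurement update with the observed centroid z (a 2x1 column)."""
    s = add(matmul(matmul(H, p_pred), transpose(H)), R)
    gain = matmul(matmul(p_pred, transpose(H)), inverse_2x2(s))
    x_new = add(x_pred, matmul(gain, sub(z, matmul(H, x_pred))))
    p_new = matmul(sub(identity(4), matmul(gain, H)), p_pred)
    return x_new, p_new

# Synthetic target moving (2, 1) pixels per frame; z stands in for the CamShift centroid.
x = [[0.0], [0.0], [0.0], [0.0]]  # detected centroid, zero initial velocity
p = identity(4)                   # assumed initial error covariance
for k in range(1, 31):
    x_pred, p_pred = predict(x, p)
    x, p = correct(x_pred, p_pred, [[2.0 * k], [1.0 * k]])
next_start, _ = predict(x, p)     # starting point for the next-frame search
```

In a tracker, the first two entries of next_start would seed the CamShift window for the next frame, and the centroid that CamShift returns would be passed to correct as z.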
When the target is judged to be occluded, the Kalman filter is suspended and tracking switches to trajectory prediction by a method based on least-squares estimation, while the CamShift algorithm searches near each predicted position to see whether the target has reappeared. If the decision criterion finds that the target is still occluded, trajectory prediction continues. If the target is no longer occluded, CamShift and Kalman filter tracking resume, and the Kalman filter state is updated with the new observation.
For a time series {x_1, x_2, x_3, ..., x_n}, i = 1, 2, ..., n, let x_n be the value to be predicted and {x_{n-1}, x_{n-2}, x_{n-3}, ..., x_{n-m}} the known related values. A straight line is fitted by the least-squares method, that is, the line is chosen so that the sum of squared errors over {x_{n-1}, x_{n-2}, x_{n-3}, ..., x_{n-m}} is minimal. Applying the least-squares fit to the target positions {x_{n-1}, x_{n-2}, x_{n-3}, ..., x_{n-m}} in the m frames before the occlusion gives a linear regression model of the target position, $\hat{x}_i = a\,i + b$, where a and b are the fitted slope and intercept.
The prediction for the first future step is then
$$\hat{x}_{n} = a\,n + b$$
and the prediction for the second future step is
$$\hat{x}_{n+1} = a\,(n+1) + b$$
and so on, until the prediction of the target position during the occlusion is complete.
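The least-squares extrapolation can be sketched in plain Python as follows. The sketch fits each coordinate against the frame index independently, using the standard closed-form line fit; the function names and the sample track are assumptions of the example, and at least two positions are required.

```python
def fit_line(values):
    """Least-squares fit values[t] ~ a * t + b for t = 0 .. m-1; returns (a, b)."""
    m = len(values)
    t_mean = (m - 1) / 2.0
    v_mean = sum(values) / m
    numerator = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    denominator = sum((t - t_mean) ** 2 for t in range(m))
    a = numerator / denominator
    return a, v_mean - a * t_mean

def extrapolate(track, steps):
    """Predict the next `steps` centroid positions from the last m tracked positions."""
    ax, bx = fit_line([pos[0] for pos in track])
    ay, by = fit_line([pos[1] for pos in track])
    m = len(track)
    return [(ax * (m + s) + bx, ay * (m + s) + by) for s in range(steps)]

# Hypothetical centroids from the four frames before the occlusion.
track = [(0, 0), (2, 1), (4, 2), (6, 3)]
predictions = extrapolate(track, steps=2)
```

Each predicted position would be the center of a CamShift search in the corresponding occluded frame, and extrapolation stops as soon as the occlusion test reports that the target is visible again.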
For target tracking in video images in the present embodiment, the CamShift algorithm proceeds as follows:
(1) convert the video image from RGB color space to HSV space;
(2) define the initial region of the target in the initial tracking frame and compute the histogram of the H component of that region;
(3) compute the probability distribution image of this histogram in each frame, that is, the degree of match of the distribution, to obtain the back projection image of each frame, and scale the pixel values of all points of the back projection image to the range [0, 255];
(4) run the CamShift search on the back projection image of each frame to find the region that best matches the initial target histogram distribution; this region is the tracked target position;
(5) use the target position found in the current frame as the starting point of the CamShift search in the next frame.
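The window search in step (4) can be illustrated with the mean-shift iteration at the core of CamShift. The plain-Python sketch below moves a fixed-size window to the centroid of the back projection values it covers until it stops moving; the adaptive window resizing that distinguishes CamShift from Mean Shift is omitted, and the function name, iteration limit and sample data are assumptions.

```python
import math

def mean_shift(back_projection, window, max_iter=10):
    """Shift window = (x, y, w, h) toward the local centroid of the back projection."""
    height, width = len(back_projection), len(back_projection[0])
    x, y, w, h = window
    for _ in range(max_iter):
        m00 = m10 = m01 = 0
        for yy in range(max(y, 0), min(y + h, height)):
            for xx in range(max(x, 0), min(x + w, width)):
                value = back_projection[yy][xx]
                m00 += value        # zeroth moment
                m10 += value * xx   # first moment in x
                m01 += value * yy   # first moment in y
        if m00 == 0:
            break  # nothing target-like under the window
        new_x = int(math.floor(m10 / m00 - (w - 1) / 2.0 + 0.5))
        new_y = int(math.floor(m01 / m00 - (h - 1) / 2.0 + 0.5))
        if (new_x, new_y) == (x, y):
            break  # converged
        x, y = new_x, new_y
    return x, y, w, h

# Hypothetical 10x10 back projection with a 3x3 bright blob at columns 6-8, rows 5-7.
back_projection = [[0] * 10 for _ in range(10)]
for row in range(5, 8):
    for col in range(6, 9):
        back_projection[row][col] = 255
result = mean_shift(back_projection, (4, 4, 3, 3))
```

The starting window only needs to overlap the blob partially, which is why a good starting point for each frame matters.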
Fifth step: judge whether tracking has finished. If tracking has not finished, obtain the current frame of the video and return to the second step; if tracking has finished, the method terminates.
Further, the method of the present invention can be implemented as a simulation system based on Visual Studio 2008 and OpenCV 2.0.
Preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning or limited experimentation in accordance with the concept of the present invention shall fall within the scope of protection defined by the claims.