A video signal preprocessing method for artificial intelligence multi-mode behavior recognition and description
Technical field
The present invention relates to a method for processing video signals, and in particular to a video signal preprocessing method for artificial intelligence multi-mode behavior recognition and description.
Background technology
Current intelligent video detection technology can detect moving targets well, but it still has shortcomings in classifying them: it is relatively difficult to distinguish vehicle types, pedestrians, and the like.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by classifying the moving targets detected at an earlier stage, so that target types can be distinguished and artificial intelligence multi-mode behavior recognition and description is realized.
The technical scheme adopted to solve the above technical problem is a video signal preprocessing method for artificial intelligence multi-mode behavior recognition and description, characterized in that a foreground extraction mode based on HOG-LBP (histogram of oriented gradients and local binary patterns) is adopted to separate people from vehicles. The preprocessing method and algorithm are as follows:
One. Implementation of HOG. HOG is implemented by first dividing the image into small connected regions called cells; then collecting a histogram of the gradient directions or edge orientations of the pixels in each cell; and finally combining these histograms to form the feature descriptor. These local histograms are contrast-normalized over a larger region of the image (a block): the density of each histogram within the block is calculated first, and each cell in the block is then normalized according to this density value. After this normalization, better robustness to illumination changes and shadows is obtained.
The histogram of oriented gradients (HOG) descriptor maintains good invariance to geometric and photometric deformations of the image, since these two kinds of deformation only appear over larger spatial regions. Moreover, under coarse spatial sampling, fine orientation sampling, and strong local photometric normalization, pedestrians are allowed some subtle limb movements as long as they maintain a roughly upright posture; these subtle movements can be ignored without affecting the detection result. The HOG method is therefore particularly suitable for pedestrian detection in images.
(1) Gradient calculation. The first step of the HOG descriptor is to compute the gradient values. The method is simply to apply a one-dimensional discrete gradient template in the horizontal and vertical directions respectively; Fig. 1 shows the gradients in the horizontal and vertical directions. The following convolution kernels can be used:
$[-1, 0, 1]$ and $[-1, 0, 1]^T$.
(2) Orientation binning. The second step of the calculation is to build the cell histograms. Each pixel in each cell casts a vote into the orientation histogram. The cell can be rectangular or circular, the orientation values of the histogram range from 0 to 180 degrees, and dividing the orientation into 9 channels gives the best results; Fig. 2 shows the orientation binning.
As for the vote weight, it can be the gradient magnitude itself or a function of it; in practical tests, the gradient magnitude itself produced the best results.
(3) Descriptor blocks. To account for changes in illumination and contrast, the gradient strengths must be locally normalized, which requires grouping the cells into larger, spatially connected blocks. The HOG descriptor is then the vector of the components of the normalized cell histograms from all of the block regions. These blocks usually overlap, meaning that each cell contributes more than once to the final descriptor. Two main block geometries exist: rectangular R-HOG blocks and circular C-HOG blocks. An R-HOG block generally consists of several cells and is represented by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels per cell histogram. The present invention adopts R-HOG blocks; experiments gave an optimal cell/block division of 3x3 or 6x6 pixels, with 9-channel histograms.
Two. Implementation of LBP. The principle of LBP is to compare the value of a central pixel with the pixels around it, thereby obtaining a binary sequence; the binary sequence is converted to a decimal number that represents the pattern. The patterns of all pixels are then counted and sorted into a histogram, in which each pattern corresponds to one component. This histogram serves as an effective description of the original image for the subsequent recognition task.
The basic LBP operator is described as follows:
$LBP_{P,R}$ denotes an LBP operator of arbitrary size, where (P, R) means P sampling points on a circle of radius R around the center point. The values of the surrounding sampling points for any P and R can be computed by bilinear interpolation. From the occurrence counts of the $2^P$ different patterns in total, the LBP histogram is obtained; Fig. 3 shows the LBP implementation process.
For any pixel in the image, its surrounding pixels are compared with it, using its value as the threshold: a neighbor greater than or equal to the pixel is marked 1, otherwise 0. Arranging these bits clockwise gives a binary value, which is converted to a decimal number; different decimal patterns yield different histograms.
To reduce the dimension of the histogram, the present invention adopts the uniform pattern, and the LBP pattern is redefined as $LBP_{n,r}^{u}$: n pixels are chosen within radius r, and the number of transitions between 0 and 1 may not exceed u. Such a pattern is a uniform pattern.
Because a foreground extraction mode is adopted, training must be carried out first; the information obtained from an image is then compared with the training result to determine whether the target is a person or a vehicle.
Three. Training based on HOG-LBP. The training flow of the histogram of oriented gradients and local binary patterns (HOG-LBP) method is as follows:
Step 1: obtain the training images (including positive and negative samples);
Step 2: calculate the histogram of oriented gradients (HOG) of the training image:
1. gradient calculation,
2. orientation binning,
3. descriptor blocks;
Step 3: calculate the local binary patterns (LBP) of the training image:
1. choose the uniform pattern $LBP_{8,1}^{2}$,
2. calculate the LBP pattern around each pixel and convert it to a decimal pattern,
3. compute the LBP histogram;
Step 4: concatenate the HOG and LBP histograms to form a training histogram;
Step 5: feed the computed training histogram into an SVM classifier for training, thereby obtaining the classification and recognition data.
Four. Recognition based on HOG-LBP. The recognition flow of the histogram of oriented gradients and local binary patterns (HOG-LBP) method is as follows:
Step 1: obtain the image to be recognized;
Step 2: calculate the histogram of oriented gradients (HOG) of this image:
1. gradient calculation,
2. orientation binning,
3. descriptor blocks;
Step 3: calculate the local binary patterns (LBP) of this image:
1. choose the uniform pattern $LBP_{8,1}^{2}$,
2. calculate the LBP pattern around each pixel and convert it to a decimal pattern,
3. compute the LBP histogram;
Step 4: concatenate the HOG and LBP histograms to form a recognition histogram;
Step 5: convolve the computed recognition histogram with the recognition data obtained in the training stage to produce an output;
Step 6: if the output is 1, the target is a person; if the output is 0, the target is a vehicle; if the output is -1, the target is some other object, defined here as a stray object.
In this way, the above method classifies the detected moving targets so that target types (person, vehicle, or stray object) can be distinguished, realizing artificial intelligence multi-mode behavior recognition and description.
The beneficial effects of the invention are: the type of each target can be distinguished well, which benefits target tracking; useful information is retained and useless information is rejected.
Description of drawings
Fig. 1 shows the gradients in the vertical and horizontal directions;
Fig. 2 shows the orientation binning of the histogram;
Fig. 3 shows the LBP implementation process.
Embodiment
The present invention is described in further detail below with reference to Fig. 1, Fig. 2, Fig. 3 and an embodiment:
A video signal preprocessing method for artificial intelligence multi-mode behavior recognition and description, characterized in that a foreground extraction mode based on HOG-LBP (histogram of oriented gradients and local binary patterns) is adopted to separate people from vehicles. The preprocessing method and algorithm are as follows:
One. Implementation of HOG. HOG is implemented by first dividing the image into small connected regions called cells; then collecting a histogram of the gradient directions or edge orientations of the pixels in each cell; and finally combining these histograms to form the feature descriptor. These local histograms are contrast-normalized over a larger region of the image (a block): the density of each histogram within the block is calculated first, and each cell in the block is then normalized according to this density value. After this normalization, better robustness to illumination changes and shadows is obtained.
The histogram of oriented gradients (HOG) descriptor maintains good invariance to geometric and photometric deformations of the image, since these two kinds of deformation only appear over larger spatial regions. Moreover, under coarse spatial sampling, fine orientation sampling, and strong local photometric normalization, pedestrians are allowed some subtle limb movements as long as they maintain a roughly upright posture; these subtle movements can be ignored without affecting the detection result. The HOG method is therefore particularly suitable for pedestrian detection in images.
(1) Gradient calculation. The first step of the HOG descriptor is to compute the gradient values. The method is simply to apply a one-dimensional discrete gradient template in the horizontal and vertical directions respectively; Fig. 1 shows the gradients in the horizontal and vertical directions. The following convolution kernels can be used:
$[-1, 0, 1]$ and $[-1, 0, 1]^T$.
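As an illustration of step (1), the following is a minimal Python sketch, not the patented implementation. It applies the [-1, 0, 1] template as a centered difference in each direction and derives a magnitude and an unsigned orientation in the 0-180 degree range for every pixel. The function names, the list-of-rows image layout, and the edge-replication border handling are assumptions made for this example.

```python
import math

def gradients(img):
    """Apply the [-1, 0, 1] template horizontally and vertically.

    img is a list of rows of grey values. Border pixels reuse the
    nearest valid neighbour (an assumption; the text does not specify).
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Right neighbour minus left neighbour (horizontal template).
            gx[y][x] = float(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)])
            # Lower neighbour minus upper neighbour (vertical template).
            gy[y][x] = float(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x])
    return gx, gy

def magnitude_orientation(gx, gy):
    """Return per-pixel gradient magnitude and unsigned orientation in degrees."""
    h, w = len(gx), len(gx[0])
    mag = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]
    ang = [[math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180.0
            for x in range(w)] for y in range(h)]
    return mag, ang
```

Because the orientation is folded into 0-180 degrees, the sign convention of the template (convolution versus correlation) does not change the resulting histogram.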
(2) Orientation binning. The second step of the calculation is to build the cell histograms. Each pixel in each cell casts a vote into the orientation histogram. The cell can be rectangular or circular, the orientation values of the histogram range from 0 to 180 degrees, and dividing the orientation into 9 channels gives the best results; Fig. 2 shows the orientation binning.
As for the vote weight, it can be the gradient magnitude itself or a function of it; in practical tests, the gradient magnitude itself produced the best results.
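The voting in step (2) can be sketched as follows. This is a simplified illustration under stated assumptions: each pixel votes into exactly one of 9 channels over 0-180 degrees with its gradient magnitude as the weight, and no interpolation between neighboring channels is performed. The function name `cell_histogram` and its flat-list inputs are hypothetical.

```python
def cell_histogram(magnitudes, angles, n_bins=9):
    """Build one cell's orientation histogram.

    magnitudes and angles are flat lists covering the pixels of one cell;
    angles are in degrees. Each pixel adds its magnitude to a single bin.
    """
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for m, a in zip(magnitudes, angles):
        # Fold the angle into [0, 180) and clamp against rounding at the edge.
        b = min(int((a % 180.0) // bin_width), n_bins - 1)
        hist[b] += m
    return hist
```

For example, with 9 channels each bin spans 20 degrees, so a pixel with orientation 25 degrees votes into the second bin.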
(3) Descriptor blocks. To account for changes in illumination and contrast, the gradient strengths must be locally normalized, which requires grouping the cells into larger, spatially connected blocks. The HOG descriptor is then the vector of the components of the normalized cell histograms from all of the block regions. These blocks usually overlap, meaning that each cell contributes more than once to the final descriptor. Two main block geometries exist: rectangular R-HOG blocks and circular C-HOG blocks. An R-HOG block generally consists of several cells and is represented by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels per cell histogram. The present invention adopts R-HOG blocks; experiments gave an optimal cell/block division of 3x3 or 6x6 pixels, with 9-channel histograms.
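The block normalization in step (3) might look like the sketch below. The text only says that cells are normalized according to a density value computed over the block; the L2 norm with a small constant `eps` used here is an assumption chosen for illustration, and `normalize_block` is a hypothetical name.

```python
import math

def normalize_block(cell_hists, eps=1e-6):
    """Concatenate the cell histograms of one block and L2-normalize them.

    cell_hists is a list of per-cell histograms (lists of floats).
    eps avoids division by zero in flat image regions.
    """
    v = [x for hist in cell_hists for x in hist]
    norm = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / norm for x in v]
```

Scaling every gradient in the block by the same factor, as a uniform change in contrast would, leaves the normalized vector essentially unchanged, which is the property the normalization is meant to provide.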
Two. Implementation of LBP. The principle of LBP is to compare the value of a central pixel with the pixels around it, thereby obtaining a binary sequence; the binary sequence is converted to a decimal number that represents the pattern. The patterns of all pixels are then counted and sorted into a histogram, in which each pattern corresponds to one component. This histogram serves as an effective description of the original image for the subsequent recognition task.
The basic LBP operator is described as follows:
$LBP_{P,R}$ denotes an LBP operator of arbitrary size, where (P, R) means P sampling points on a circle of radius R around the center point. The values of the surrounding sampling points for any P and R can be computed by bilinear interpolation. From the occurrence counts of the $2^P$ different patterns in total, the LBP histogram is obtained; Fig. 3 shows the LBP implementation process.
For any pixel in the image, its surrounding pixels are compared with it, using its value as the threshold: a neighbor greater than or equal to the pixel is marked 1, otherwise 0. Arranging these bits clockwise gives a binary value, which is converted to a decimal number; different decimal patterns yield different histograms.
To reduce the dimension of the histogram, the present invention adopts the uniform pattern, and the LBP pattern is redefined as $LBP_{n,r}^{u}$: n pixels are chosen within radius r, and the number of transitions between 0 and 1 may not exceed u. Such a pattern is a uniform pattern.
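The LBP computation described above can be sketched in Python as follows. This is an illustrative simplification, not the patented implementation: it uses the 8 pixels of the 3x3 neighborhood directly instead of bilinear interpolation on a circle, starts the clockwise ordering at the top-left neighbor, and puts all non-uniform patterns into one shared bin. The names `lbp_code`, `transitions`, `is_uniform`, and `lbp_histogram` are hypothetical.

```python
# Clockwise 3x3 neighbour offsets (dy, dx), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, y, x):
    """Threshold the 8 neighbours against the centre and pack them into a number."""
    centre = img[y][x]
    code = 0
    for dy, dx in OFFSETS:
        code = (code << 1) | (1 if img[y + dy][x + dx] >= centre else 0)
    return code

def transitions(code, bits=8):
    """Count 0/1 changes around the circular bit pattern."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")

def is_uniform(code, u=2):
    """A pattern is uniform when it has at most u transitions."""
    return transitions(code) <= u

UNIFORM_CODES = [c for c in range(256) if is_uniform(c)]
BIN_OF = {c: i for i, c in enumerate(UNIFORM_CODES)}

def lbp_histogram(img):
    """Histogram of uniform patterns over interior pixels, plus one catch-all bin."""
    hist = [0] * (len(UNIFORM_CODES) + 1)
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            code = lbp_code(img, y, x)
            hist[BIN_OF.get(code, len(UNIFORM_CODES))] += 1
    return hist
```

With 8 neighbors and u = 2, only 58 of the 256 possible patterns are uniform, so the histogram shrinks from 256 components to 59 in this sketch.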
Because a foreground extraction mode is adopted, training must be carried out first; the information obtained from an image is then compared with the training result to determine whether the target is a person or a vehicle.
Three. Training based on HOG-LBP. The training flow of the histogram of oriented gradients and local binary patterns (HOG-LBP) method is as follows:
Step 1: obtain the training images (including positive and negative samples);
Step 2: calculate the histogram of oriented gradients (HOG) of the training image:
1. gradient calculation,
2. orientation binning,
3. descriptor blocks.
Step 3: calculate the local binary patterns (LBP) of the training image:
1. choose the uniform pattern $LBP_{8,1}^{2}$,
2. calculate the LBP pattern around each pixel and convert it to a decimal pattern,
3. compute the LBP histogram.
Step 4: concatenate the HOG and LBP histograms to form a training histogram;
Step 5: feed the computed training histogram into an SVM classifier for training, thereby obtaining the classification and recognition data.
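Steps 4 and 5 of the training flow can be sketched as below. The text does not specify the SVM formulation, so this example assumes a binary linear SVM trained by stochastic sub-gradient descent on the hinge loss, with labels +1 and -1; the three-way output (person, vehicle, other) described elsewhere would need an extension such as one classifier per class. All function names and hyperparameters are hypothetical.

```python
def concat_features(hog_hist, lbp_hist):
    """Step 4: join the HOG and LBP histograms into one feature vector."""
    return list(hog_hist) + list(lbp_hist)

def decision(w, b, x):
    """Linear decision value w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """Step 5 (assumed form): hinge-loss linear SVM, labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * decision(w, b, x)
            # L2 regularisation shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # The sample violates the margin: move the boundary towards it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b
```

The returned pair (w, b) plays the role of the classification and recognition data: at recognition time, the sign of `decision(w, b, features)` indicates the class of a new feature vector.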
Four. Recognition based on HOG-LBP. The recognition flow of the histogram of oriented gradients and local binary patterns (HOG-LBP) method is as follows:
Step 1: obtain the image to be recognized;
Step 2: calculate the histogram of oriented gradients (HOG) of this image:
1. gradient calculation,
2. orientation binning,
3. descriptor blocks.
Step 3: calculate the local binary patterns (LBP) of this image:
1. choose the uniform pattern $LBP_{8,1}^{2}$,
2. calculate the LBP pattern around each pixel and convert it to a decimal pattern,
3. compute the LBP histogram.
Step 4: concatenate the HOG and LBP histograms to form a recognition histogram;
Step 5: convolve the computed recognition histogram with the recognition data obtained in the training stage to produce an output;
Step 6: if the output is 1, the target is a person; if the output is 0, the target is a vehicle; if the output is -1, the target is some other object, defined here as a stray object.
In this way, the above method classifies the detected moving targets so that target types (person, vehicle, or stray object) can be distinguished, realizing artificial intelligence multi-mode behavior recognition and description.