CN103400138A - Video signal preprocessing method for artificial intelligent multimode behavior recognition and description - Google Patents

Video signal preprocessing method for artificial intelligent multimode behavior recognition and description

Info

Publication number
CN103400138A
Authority
CN
China
Prior art keywords
histogram
lbp
hog
video signal
behavior recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013103342699A
Other languages
Chinese (zh)
Inventor
沈玉琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-07-29
Filing date
2013-07-29
Publication date
2013-11-20
Application filed by Individual
Priority to CN2013103342699A
Publication of CN103400138A
Legal status: Pending

Abstract

The invention relates to a video signal preprocessing method for artificial-intelligence multi-mode behavior recognition and description. Moving objects detected at an earlier stage are classified so that object types can be distinguished, which enables artificial-intelligence multi-mode behavior recognition and description. The method is characterized in that an HOG-LBP (Histogram of Oriented Gradients and Local Binary Patterns) foreground extraction method is adopted, and a corresponding preprocessing method and algorithm are provided. The beneficial effects are that object types can be distinguished well, which benefits object tracking, and that useful information is retained while useless information is discarded.

Description

A video signal preprocessing method for artificial-intelligence multi-mode behavior recognition and description
Technical field
The present invention relates to a method for processing video signals, and in particular to a video signal preprocessing method for artificial-intelligence multi-mode behavior recognition and description.
Background art
Current intelligent video detection technology detects moving targets well, but it remains deficient at classifying them: it is still relatively difficult to distinguish, for example, vehicle types and pedestrians.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by classifying the moving targets detected at an earlier stage, so that target types can be distinguished and artificial-intelligence multi-mode behavior recognition and description can be realized.
The technical solution adopted to solve the above technical problem is a video signal preprocessing method for artificial-intelligence multi-mode behavior recognition and description, characterized in that a foreground extraction approach based on HOG-LBP (histogram of oriented gradients and local binary patterns) is adopted to separate people from vehicles. The preprocessing method and algorithm are as follows:
I. Implementation of HOG. HOG is implemented by first dividing the image into small connected regions called cells. A histogram of the gradient directions or edge orientations of the pixels in each cell is then accumulated. Finally, these histograms are combined to form the feature descriptor. The local histograms are contrast-normalized over a larger region of the image (a block): the density of each histogram within the block is computed first, and each cell in the block is then normalized according to this density value. After this normalization, the descriptor is more stable under illumination changes and shadows.
The histogram of oriented gradients (HOG) descriptor remains largely invariant to geometric and photometric deformations of the image, because these two kinds of deformation only appear over larger spatial regions. Moreover, with coarse spatial sampling, fine orientation sampling and strong local photometric normalization, a pedestrian who stays roughly upright may make small limb movements, and these can be ignored without affecting detection. The HOG method is therefore particularly well suited to pedestrian detection in images.
(1) Gradient computation. The first step of the HOG descriptor is to compute the gradient values. A simple one-dimensional discrete gradient mask is applied in the horizontal and in the vertical direction; Fig. 1 shows the horizontal and vertical gradients. The following convolution kernels can be used:
[-1, 0, 1] and [-1, 0, 1]^T.
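The gradient step can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the image is a plain list of rows, border pixels replicate the edge value, and the mask is applied as a correlation so that gx = I(x+1) - I(x-1). The sign convention does not matter for the unsigned 0-180 degree orientation used later.

```python
import math

def gradients(img):
    """Return (gx, gy) from the centered 1-D mask [-1, 0, 1] along x and y.

    img is a list of rows of gray levels. Replicating the edge value at the
    border is an assumption of this sketch.
    """
    h, w = len(img), len(img[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            gx[y][x] = float(right - left)  # mask [-1, 0, 1]
            gy[y][x] = float(down - up)     # mask [-1, 0, 1]^T
    return gx, gy

def magnitude_and_orientation(gx, gy):
    """Per-pixel gradient magnitude and unsigned orientation in degrees."""
    mag = [[math.hypot(a, b) for a, b in zip(rx, ry)]
           for rx, ry in zip(gx, gy)]
    ang = [[math.degrees(math.atan2(b, a)) % 180.0 for a, b in zip(rx, ry)]
           for rx, ry in zip(gx, gy)]
    return mag, ang
```

On a horizontal ramp such as [[0, 1, 2], [0, 1, 2], [0, 1, 2]] the center pixel has gx = 2 and gy = 0, so its orientation is 0 degrees; transposing the ramp gives 90 degrees.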
(2) Orientation binning. The second step is to build the cell histograms. Each pixel in a cell casts a vote into the orientation histogram. A cell may be rectangular or circular. The orientation values of the histogram span 0-180 degrees, and dividing this range into 9 channels gives the best results; Fig. 2 shows the orientation binning.
The vote weight can be the gradient magnitude itself or a function of it; in practical tests, the gradient magnitude itself gives the best results.
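A minimal sketch of the voting step, assuming the per-pixel magnitudes and orientations of one cell are already available as 2-D lists. The function name is hypothetical, and each pixel votes into a single bin (hard assignment); interpolating votes between neighboring bins is a common refinement that the text does not mention and this sketch omits.

```python
def cell_histogram(mag, ang, n_bins=9):
    """Accumulate one unsigned-orientation histogram for a single cell.

    mag and ang hold, for every pixel of the cell, the gradient magnitude
    and the orientation in degrees (0-180). Each pixel votes with a weight
    equal to its gradient magnitude.
    """
    bin_width = 180.0 / n_bins          # 20 degrees for 9 channels
    hist = [0.0] * n_bins
    for row_m, row_a in zip(mag, ang):
        for m, a in zip(row_m, row_a):
            b = int(a // bin_width) % n_bins  # an angle of 180 wraps to bin 0
            hist[b] += m
    return hist
```

With 9 channels, orientations 0 and 19.9 fall into channel 0, 20 into channel 1 and 179 into channel 8.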
(3) Descriptor blocks. To account for changes in illumination and contrast, the gradient strengths are normalized locally, which requires grouping the cells into larger, spatially connected blocks. The HOG descriptor is then the vector of the elements of the normalized cell histograms from all block regions. These blocks usually overlap, so each cell contributes more than once to the final descriptor. Two main block geometries exist: rectangular R-HOG blocks and circular C-HOG blocks. An R-HOG block is generally a grid of cells described by three parameters: the number of cells per block, the number of pixels per cell, and the number of channels per cell histogram. The present invention adopts R-HOG blocks; experiments gave an optimal cell/block division of 3x3 or 6x6 pixels, with 9-channel histograms.
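The block step can be illustrated as follows. This is a sketch under stated assumptions: the text speaks only of a per-block "density", and the L2 norm used here is one common choice standing in for it; the function names, the block size of 2x2 cells and the stride of one cell are illustrative and not taken from the patent.

```python
import math

def normalize_block(cell_hists, eps=1e-6):
    """Concatenate the cell histograms of one block and L2-normalize them.

    The L2 norm and the small constant eps are assumptions of this sketch.
    """
    flat = [v for hist in cell_hists for v in hist]
    norm = math.sqrt(sum(v * v for v in flat) + eps * eps)
    return [v / norm for v in flat]

def hog_descriptor(cells, block=2, stride=1):
    """Slide a block x block window over a 2-D grid of cell histograms.

    Blocks overlap when stride < block, so each cell contributes to the
    final descriptor more than once.
    """
    rows, cols = len(cells), len(cells[0])
    desc = []
    for r in range(0, rows - block + 1, stride):
        for c in range(0, cols - block + 1, stride):
            group = [cells[r + i][c + j]
                     for i in range(block) for j in range(block)]
            desc.extend(normalize_block(group))
    return desc
```

For a 3x3 grid of 9-channel cell histograms, this yields 4 overlapping blocks of 36 values each, so the descriptor has 144 elements.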
II. Implementation of LBP. The LBP principle is to compare the central pixel with the pixels around it, which yields a binary sequence. The binary sequence is converted to a decimal value that represents the pattern. The patterns of all pixels are then counted to obtain a histogram in which each pattern corresponds to one component. This histogram serves as an effective description of the original image for subsequent recognition tasks.
The basic LBP operator is described as follows:
The notation LBP_{P,R} denotes an LBP operator of arbitrary size, where (P, R) means P sample points on a circle of radius R around the center point. The values of the surrounding sample points can be computed for any P and R by bilinear interpolation. The LBP histogram is obtained from the number of occurrences of each of the 2^P different patterns; Fig. 3 shows the LBP procedure.
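The sampling of P points on a circle of radius R can be sketched as below. The function names are hypothetical, and the angular convention (first sample on the positive x axis, subsequent samples proceeding counter-clockwise with the image y axis pointing down) is an assumption of this sketch; the text fixes neither. The caller is assumed to keep the circle inside the image.

```python
import math

def bilinear(img, y, x):
    """Gray level at real-valued (y, x), interpolated from the 4 nearest pixels."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1 = min(y0 + 1, len(img) - 1)
    x1 = min(x0 + 1, len(img[0]) - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def circular_neighbors(img, yc, xc, P=8, R=1.0):
    """Sample P equally spaced points on the circle of radius R around (yc, xc)."""
    samples = []
    for p in range(P):
        theta = 2.0 * math.pi * p / P
        y = yc - R * math.sin(theta)
        x = xc + R * math.cos(theta)
        samples.append(bilinear(img, y, x))
    return samples
```

On an image whose gray level equals the column index, the diagonal sample of LBP_{8,1} around column 2 evaluates to 2 + cos(45 degrees), which shows that off-grid points are interpolated rather than rounded to the nearest pixel.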
For any pixel in the image, each surrounding pixel is compared with it, the pixel itself acting as the threshold: a neighbor greater than or equal to the pixel is marked 1, otherwise 0. Reading the marks in clockwise order gives a binary value, which is converted to a decimal number; different decimal patterns yield different histograms. To reduce the dimension of the histogram, the present invention adopts the uniform pattern: the LBP pattern is redefined as LBP_{N,r}^{u}, in which N pixels are chosen within radius r and the number of transitions between 0 and 1 may not exceed u. Such a pattern is a uniform pattern.
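The thresholding and the uniform-pattern test can be sketched as follows. The function names are hypothetical. Counting transitions circularly and collecting all non-uniform codes into one shared histogram bin are common conventions for uniform LBP and are assumptions here; the text itself only states the limit u on the number of transitions.

```python
def lbp_code(neighbors, center):
    """Threshold the neighbors against the center and pack the bits.

    neighbors are listed in clockwise order; a neighbor >= center gives 1.
    """
    code = 0
    for value in neighbors:
        code = (code << 1) | (1 if value >= center else 0)
    return code

def transitions(code, P=8):
    """Number of 0->1 and 1->0 changes when the P bits are read circularly."""
    count = 0
    for i in range(P):
        a = (code >> i) & 1
        b = (code >> ((i + 1) % P)) & 1
        count += a != b
    return count

def uniform_lookup(P=8, u=2):
    """Map every code to a histogram bin: one bin per uniform pattern,
    plus a single shared bin for all non-uniform patterns (assumption)."""
    table, next_bin = {}, 0
    for c in range(2 ** P):
        if transitions(c, P) <= u:
            table[c] = next_bin
            next_bin += 1
    for c in range(2 ** P):
        table.setdefault(c, next_bin)  # shared non-uniform bin
    return table, next_bin + 1
```

For N = 8 and u = 2 there are 58 uniform codes, so the histogram shrinks from 256 components to 59 under this convention.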
Because a foreground extraction approach is adopted, training is required first; the information obtained from an image is then compared with the training result to determine whether the target is a person or a vehicle.
III. Training based on HOG-LBP. The training procedure of the histogram of oriented gradients and local binary patterns method (HOG-LBP) is as follows:
Step 1: obtain the training images (including positive and negative samples);
Step 2: compute the histogram of oriented gradients (HOG) of the training image:
1. gradient computation,
2. orientation binning,
3. descriptor blocks;
Step 3: compute the local binary patterns (LBP) of the training image:
1. choose the uniform pattern LBP_{8,1}^{2} (N = 8, r = 1, u = 2),
2. compute the LBP pattern of each pixel's neighborhood and convert it to a decimal pattern,
3. accumulate the LBP histogram;
Step 4: concatenate the HOG and LBP histograms to form a training histogram;
Step 5: feed the computed training histogram into the SVM classifier for training, which yields the classification and recognition data.
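Steps 4 and 5 can be sketched as below. The patent does not say which SVM formulation or library is used, so the trainer here is a stand-in: a tiny linear SVM fitted by batch sub-gradient descent on the hinge loss, with labels in {+1, -1}. All function names and hyper-parameters are illustrative assumptions.

```python
def concat_features(hog_hist, lbp_hist):
    """Step 4: join the HOG and LBP histograms into one training vector."""
    return list(hog_hist) + list(lbp_hist)

def decision(w, b, x):
    """Signed score of feature vector x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Step 5 stand-in: batch sub-gradient descent on the hinge loss.

    X is a list of feature vectors, y the labels in {+1, -1}.
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0   # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1.0:  # sample violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b
```

A usage sketch: build each training vector with `concat_features(hog, lbp)`, label positive samples +1 and negative samples -1, and keep the returned `(w, b)` pair as the "classification and recognition data" for the recognition stage.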
IV. Recognition based on HOG-LBP. The recognition procedure of the histogram of oriented gradients and local binary patterns method (HOG-LBP) is as follows:
Step 1: obtain the image to be recognized;
Step 2: compute the histogram of oriented gradients (HOG) of the image to be recognized:
1. gradient computation,
2. orientation binning,
3. descriptor blocks;
Step 3: compute the local binary patterns (LBP) of the image to be recognized:
1. choose the uniform pattern LBP_{8,1}^{2} (N = 8, r = 1, u = 2),
2. compute the LBP pattern of each pixel's neighborhood and convert it to a decimal pattern,
3. accumulate the LBP histogram;
Step 4: concatenate the HOG and LBP histograms to form a recognition histogram;
Step 5: convolve the computed recognition histogram with the recognition data obtained in the training stage, which yields an output result;
Step 6: if the output is 1, the target is a person; if the output is 0, the target is a vehicle; if the output is -1, the target is another object, defined here as an unrestrained object.
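The decision in Steps 5 and 6 can be illustrated as below. This sketch reads the "convolution" with the training data as an inner product with trained linear weights, which is an interpretation rather than something the text states. The two-model layout (one for people, one for vehicles), the function names and the tie-breaking rule are likewise assumptions; only the output codes 1, 0 and -1 come from the text.

```python
PERSON, VEHICLE, OTHER = 1, 0, -1  # output codes named in Step 6

def linear_score(weights, bias, feature):
    """Inner product of a feature vector with one trained linear model."""
    return sum(w * f for w, f in zip(weights, feature)) + bias

def classify(feature, person_model, vehicle_model):
    """Return 1 (person), 0 (vehicle) or -1 (other object).

    Each model is a (weights, bias) pair from a hypothetical training step.
    A class is accepted only if its score is positive; if both scores are
    positive, the larger one wins. This rule is an assumption of the sketch.
    """
    pw, pb = person_model
    vw, vb = vehicle_model
    p = linear_score(pw, pb, feature)
    v = linear_score(vw, vb, feature)
    if p <= 0 and v <= 0:
        return OTHER
    return PERSON if p >= v else VEHICLE
```

With toy models `([1.0, -1.0], -0.5)` for people and `([-1.0, 1.0], -0.5)` for vehicles, the feature [1, 0] maps to 1, [0, 1] maps to 0 and [0.5, 0.5] maps to -1.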
In this way, the above method classifies detected moving targets so that the target type (person, vehicle or unrestrained object) can be distinguished, realizing artificial-intelligence multi-mode behavior recognition and description.
The beneficial effects of the invention are that the type of each target can be distinguished well, which benefits target tracking, and that useful information is retained while useless information is discarded.
Description of drawings
Fig. 1 shows the vertical and horizontal gradients;
Fig. 2 shows the orientation binning of the histogram;
Fig. 3 shows the LBP procedure.
Embodiment
The embodiment is described with reference to Fig. 1, Fig. 2 and Fig. 3. It carries out the method exactly as set out in the Summary of the invention above: HOG feature computation, LBP feature computation, HOG-LBP training with an SVM classifier, and HOG-LBP recognition, which outputs 1 for a person, 0 for a vehicle and -1 for any other object.

Claims (4)

2. The video signal preprocessing method for artificial-intelligence multi-mode behavior recognition and description according to claim 1, characterized in that HOG is implemented by first dividing the image into small connected regions called cells; a histogram of the gradient directions or edge orientations of the pixels in each cell is then accumulated; finally, these histograms are combined to form the feature descriptor. The local histograms are contrast-normalized over a larger region of the image (a block): the density of each histogram within the block is computed first, and each cell in the block is then normalized according to this density value. After this normalization, the descriptor is more stable under illumination changes and shadows.
Application number: CN2013103342699A; priority date: 2013-07-29; filing date: 2013-07-29; title: Video signal preprocessing method for artificial intelligent multimode behavior recognition and description; status: Pending; publication: CN103400138A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN2013103342699A (CN103400138A (en)) | 2013-07-29 | 2013-07-29 | Video signal preprocessing method for artificial intelligent multimode behavior recognition and description

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN2013103342699A (CN103400138A (en)) | 2013-07-29 | 2013-07-29 | Video signal preprocessing method for artificial intelligent multimode behavior recognition and description

Publications (1)

Publication Number | Publication Date
CN103400138A (en) | 2013-11-20

Family

ID=49563756

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN2013103342699A (CN103400138A (en)) | Video signal preprocessing method for artificial intelligent multimode behavior recognition and description | 2013-07-29 | 2013-07-29 | Pending

Country Status (1)

Country | Link
CN (1) | CN103400138A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20100067742A1 (en) * | 2008-09-12 | 2010-03-18 | Sony Corporation | Object detecting device, imaging apparatus, object detecting method, and program
CN102663409A (en) * | 2012-02-28 | 2012-09-12 | 西安电子科技大学 | Pedestrian tracking method based on HOG-LBP
CN102663366A (en) * | 2012-04-13 | 2012-09-12 | 中国科学院深圳先进技术研究院 | Method and system for identifying pedestrian target
CN103150375A (en) * | 2013-03-11 | 2013-06-12 | 浙江捷尚视觉科技有限公司 | Quick video retrieval system and quick video retrieval method for video detection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
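覃远霞: "基于数据挖掘工具的人脸识别LBP计算" [LBP computation for face recognition based on data mining tools], 《制造业自动化》 [Manufacturing Automation] *
陈健斌: "图像特征提取及其相似度的研究和实现" [Research and implementation of image feature extraction and similarity], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] *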

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104978567A (en) * | 2015-06-11 | 2015-10-14 | 武汉大千信息技术有限公司 | Vehicle detection method based on scenario classification
CN104978567B (en) * | 2015-06-11 | 2018-11-20 | 武汉大千信息技术有限公司 | Vehicle checking method based on scene classification
CN110135254A (en) * | 2019-04-12 | 2019-08-16 | 华南理工大学 | A Fatigue Expression Recognition Method
CN112887765A (en) * | 2021-01-08 | 2021-06-01 | 武汉兴图新科电子股份有限公司 | Code rate self-adaptive adjustment system and method applied to cloud fusion platform
CN112887765B (en) * | 2021-01-08 | 2022-07-26 | 武汉兴图新科电子股份有限公司 | Bit rate adaptive adjustment system and method applied to cloud fusion platform

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C02 | Deemed withdrawal of patent application after publication (patent law 2001)
WD01 | Invention patent application deemed withdrawn after publication

Application publication date: 2013-11-20

