Summary of the invention
The object of the present invention is to provide a warning method and device based on accurate person-vehicle separation, which quickly and accurately locate and identify the target that triggers an alarm in a warning surveillance scene, helping monitoring personnel determine whether the target that triggered the alarm is a person or a vehicle.
To achieve the above object, the technical solution of the present invention is realized as follows:
A warning method based on accurate person-vehicle separation, comprising:
1) obtaining the moving region within the warning region using background modeling, determining candidate regions using a blob detection method, and cropping the ROI of the image to be inspected according to each candidate region;
2) feeding the cropped image to be inspected into a trained deep learning detection network, which detects and identifies the persons and vehicles in the image and outputs their bounding rectangles, categories, and confidence levels.
Further, step 1) specifically comprises: obtaining a foreground image using background modeling; using a blob detection method to merge adjacent foreground points into targets and taking the bounding rectangle of each as a moving blob; excluding blobs that differ greatly from a person or a vehicle, according to the width-to-height ratio of the moving blob and the proportion of foreground points among the total pixels within the blob; and taking the remaining blobs as candidate regions, according to which the image to be inspected is cropped.
Further, the method for the step 2) detection includes:
201) people's vehicle training dataset is constructed;
202) according to people's vehicle feature construction deep learning training network architecture;
203) configuration training parameter starts to train;
204) people's vehicle is detected with training pattern.
Further, the specific method of constructing the person-vehicle training dataset in step 201) comprises: obtaining, from a large number of images captured by warning cameras, images of persons and vehicles that satisfy the alarm-triggering condition under different angles and different lighting conditions; scaling the original 1920X1080-pixel images to 1080X720 and 540X360 to obtain three groups of multi-scale images; for each image, taking the shorter side as the side length and cropping two square images, one from the left and one from the right, as training images; marking the positions and categories of persons and vehicles in the images with an annotation tool to generate annotation files in xml format; saving the images as the dataset; and taking 80% of the data as the training set and the remaining data as the test set.
Further, the method of constructing the deep learning training network architecture according to person-vehicle features in step 202) comprises: the constructed network is divided into two parts, feature extraction and multi-fusion detection. Starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1: the 64 feature maps obtained by the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, each group of filters is convolved with its corresponding feature maps, and 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps, which serve as input to the detection layers and to the next convolutional layer; subsequent layers are computed in the same way. The multi-fusion detection part collects feature maps of different sizes and computes the loss between the position (x, y, w, h) and confidence of each predicted box and the ground-truth position to obtain mbox_loss; the final output is the (x, y, w, h) and confidence that minimize the loss, and the category of the target box is determined from the output confidence vector.
Another aspect of the present invention provides a warning device based on accurate person-vehicle separation, comprising:
a cropping module, configured to obtain the moving region within the warning region using background modeling, determine candidate regions using a blob detection method, and crop the ROI of the image to be inspected according to each candidate region;
a deep learning module, configured to feed the cropped image to be inspected into a trained deep learning detection network, which detects and identifies the persons and vehicles in the image and outputs their bounding rectangles, categories, and confidence levels.
Further, the cropping module is specifically configured to: obtain a foreground image using background modeling; use a blob detection method to merge adjacent foreground points into targets and take the bounding rectangle of each as a moving blob; exclude blobs that differ greatly from a person or a vehicle, according to the width-to-height ratio of the moving blob and the proportion of foreground points among the total pixels within the blob; and take the remaining blobs as candidate regions, according to which the image to be inspected is cropped.
Further, the deep learning module comprises:
a dataset unit, configured to construct a person-vehicle training dataset;
an architecture unit, configured to construct a deep learning training network architecture according to person-vehicle features;
a training unit, configured to configure the training parameters and start training;
a detection unit, configured to detect persons and vehicles using the trained model.
Further, the dataset unit is specifically configured to: obtain, from a large number of images captured by warning cameras, images of persons and vehicles that satisfy the alarm-triggering condition under different angles and different lighting conditions; scale the original 1920X1080-pixel images to 1080X720 and 540X360 to obtain three groups of multi-scale images; for each image, take the shorter side as the side length and crop two square images, one from the left and one from the right, as training images; mark the positions and categories of persons and vehicles in the images with an annotation tool to generate annotation files in xml format; save the images as the dataset; and take 80% of the data as the training set and the remaining data as the test set.
Further, the architecture unit is specifically configured as follows: the constructed network is divided into two parts, feature extraction and multi-fusion detection. Starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1: the 64 feature maps obtained by the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, each group of filters is convolved with its corresponding feature maps, and 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps, which serve as input to the detection layers and to the next convolutional layer; subsequent layers are computed in the same way. The multi-fusion detection part collects feature maps of different sizes and computes the loss between the position (x, y, w, h) and confidence of each predicted box and the ground-truth position to obtain mbox_loss; the final output is the (x, y, w, h) and confidence that minimize the loss, and the category of the target box is determined from the output confidence vector.
Compared with the prior art, the present invention has the following beneficial effects:
The present invention is highly robust both in distinguishing target categories and in accurately locating target positions, and can also accurately filter out warning alarms caused by objects other than persons and vehicles.
Specific embodiment
It should be noted that, provided there is no conflict, the embodiments of the present invention and the features in the embodiments may be combined with one another.
The present invention proposes a warning method based on accurate person-vehicle separation, the technical solution of which comprises the following steps:
1. obtaining the moving region within the warning region using background modeling, determining candidate regions using a blob detection method, and cropping the ROI of the image to be inspected according to each candidate region;
2. feeding the cropped image to be inspected into a trained deep learning detection network, which detects and identifies the persons and vehicles in the image and outputs their bounding rectangles, categories, and confidence levels;
In step 1, a foreground image is obtained using background modeling; using a blob detection method, adjacent foreground points are merged into targets, and the bounding rectangle of each is taken as a moving blob. Let a rectangular moving blob be denoted (x, y, w, h), and let the corresponding region of the foreground image contain n1 foreground points out of n2 total pixels. Blobs that satisfy the following condition are selected as candidate regions, according to which the image to be inspected is cropped:
0.15<(w/h)<2 and (n1/n2)>0.3
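The blob merging and the screening condition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how adjacent foreground points are merged, so 8-connectivity is assumed here, and the names `find_blobs` and `is_candidate` are hypothetical. The foreground image is taken to be a list of rows of 0/1 values.

```python
from collections import deque


def find_blobs(mask):
    """Merge 8-connected foreground pixels (assumed adjacency rule) and
    return the bounding rectangle (x, y, w, h) of each merged blob."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                cr, cc = queue.popleft()
                # Grow the bounding rectangle to cover this pixel.
                r0, r1 = min(r0, cr), max(r1, cr)
                c0, c1 = min(c0, cc), max(c1, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            blobs.append((c0, r0, c1 - c0 + 1, r1 - r0 + 1))
    return blobs


def is_candidate(mask, blob):
    """Apply the condition 0.15 < w/h < 2 and n1/n2 > 0.3 from the text."""
    x, y, w, h = blob
    # n1: foreground points inside the rectangle; n2: all pixels inside it.
    n1 = sum(mask[r][c] for r in range(y, y + h) for c in range(x, x + w))
    n2 = w * h
    return 0.15 < w / h < 2 and n1 / n2 > 0.3
```

With this sketch, an upright, well-filled rectangle passes the screen, while a long thin horizontal streak (w/h of 6, for example) is rejected before it ever reaches the detection network.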
In step 2, the cropped image to be inspected is fed into the trained deep learning detection network, which detects and identifies the persons and vehicles in the image and outputs their bounding rectangles, categories, and confidence levels. The specific implementation is as follows:
1) constructing the person-vehicle training dataset;
To give the model good generalization ability so that it adapts to a variety of scenes, images of persons and vehicles that satisfy the alarm-triggering condition under different angles and different lighting conditions are obtained from a large number of images captured by warning cameras. Because most of the collected warning images are 1920X1080 pixels, the original images are scaled to 1080X720 and 540X360 to strengthen the multi-scale recognition ability of the network model, giving three groups of multi-scale images. To simplify network training, two square images are cropped from each image, one from the left and one from the right, with the shorter side as the side length, and are used as training images. An annotation tool is then used to mark the positions and categories of persons and vehicles in the images and to generate annotation files in xml format. The images are saved in the JPEGImages folder and the xml annotation files are placed in the Annotations folder to form the dataset; 80% of the data is taken as the training set and the remaining data as the test set;
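The cropping and splitting arithmetic described above can be sketched as follows. This is a minimal illustration, not the tooling actually used: the function names are hypothetical, crop boxes are written as (left, top, right, bottom), and the random shuffle before the 80/20 split is an assumption, since the text does not say how the 80% is chosen.

```python
import random

# The three image sizes named in the text (width, height).
SCALES = [(1920, 1080), (1080, 720), (540, 360)]


def square_crops(width, height):
    """Return the left-aligned and right-aligned square crop boxes whose
    side length equals the shorter side of the image."""
    side = min(width, height)
    left_box = (0, 0, side, side)
    right_box = (width - side, height - side, width, height)
    return [left_box, right_box]


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle (an assumption) and split into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


for w, h in SCALES:
    print((w, h), square_crops(w, h))
```

For a 1920X1080 image the two boxes are (0, 0, 1080, 1080) and (840, 0, 1920, 1080), so the two squares overlap in the middle of the frame and together cover its full width.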
2) constructing the deep learning training network architecture according to person-vehicle features;
As shown in Fig. 1, the constructed network is divided into two parts, feature extraction and multi-fusion detection. Starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1: the 64 feature maps obtained by the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, each group of filters is convolved with its corresponding feature maps, and 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps, which serve as input to the detection layers and to the next convolutional layer; subsequent layers are computed in the same way. The multi-fusion detection part collects feature maps of different sizes and computes the loss between the predicted box position l (x, y, w, h) (the output of mbox_loc) together with the confidence (the output of mbox_cof) and the ground truth g (x, y, w, h), obtaining mbox_loss according to the following formula:
where p denotes the category to which the target belongs, and the indicator term denotes whether the i-th predicted box matches the j-th ground-truth box with respect to category k; it is 1 if they match and 0 otherwise.
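The formula itself does not survive in this text. As a point of reference only, and assuming the network follows the standard SSD multibox objective (which the layer names mbox_loc and mbox_loss suggest but this text does not confirm), the loss would take the form below. The symbols N (number of matched predicted boxes), α (weight of the localization term), c (class scores), the sets Pos and Neg of matched and unmatched boxes, and the encoded ground-truth offsets ĝ are introduced for this sketch; cx and cy stand for the box position written (x, y) above.

```latex
L(x, c, l, g) = \frac{1}{N}\Bigl( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \Bigr)

L_{loc}(x, l, g) = \sum_{i \in Pos} \; \sum_{m \in \{cx,\, cy,\, w,\, h\}}
    x_{ij}^{k} \, \mathrm{smooth}_{L1}\bigl( l_i^{m} - \hat{g}_j^{m} \bigr)

L_{conf}(x, c) = - \sum_{i \in Pos} x_{ij}^{p} \log \hat{c}_i^{p}
                 - \sum_{i \in Neg} \log \hat{c}_i^{0},
\qquad
\hat{c}_i^{p} = \frac{\exp(c_i^{p})}{\sum_{q} \exp(c_i^{q})}
```

In this form the indicator x_ij is the matching term described above: it switches the localization and confidence terms on only for predicted boxes that are matched to a ground-truth box.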
The final output is the (x, y, w, h) and confidence that minimize the loss, and the category of the target box is determined from the output confidence vector.
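The effect of splitting the second layer into two groups can be checked with simple arithmetic. The sketch below counts convolution weights (biases ignored) and lists which channels each group reads and writes. The helper names are hypothetical, and the number of 1*1 fusion filters is assumed to be 128, which the text does not state.

```python
def conv_weights(in_ch, out_ch, k, groups=1):
    """Number of weights in a k*k convolution with the given grouping."""
    assert in_ch % groups == 0 and out_ch % groups == 0
    return out_ch * (in_ch // groups) * k * k


def group_slices(in_ch, out_ch, groups):
    """Channel ranges (input, output) handled by each group."""
    gi, go = in_ch // groups, out_ch // groups
    return [((g * gi, (g + 1) * gi), (g * go, (g + 1) * go))
            for g in range(groups)]


# Second layer as described: 64 input maps, 128 filters of 3*3, two groups.
grouped = conv_weights(64, 128, 3, groups=2)
ungrouped = conv_weights(64, 128, 3)
# 1*1 fusion over the concatenated 128 maps (output count is an assumption).
fusion = conv_weights(128, 128, 1)

print(group_slices(64, 128, 2))
print(grouped, ungrouped, fusion)
```

Under these assumptions the grouped 3*3 layer needs 36864 weights instead of 73728, and the 1*1 layer that recombines the two groups adds 16384, so the pair is still cheaper than a single ungrouped 3*3 layer while letting information flow between the groups.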
3) configuring the training parameters and starting training;
Tuning the network input parameters is crucial to improving the accuracy and recall of the model. The main parameters tuned are the image preprocessing parameters, where brightness, chroma, and saturation adjustments are applied to the images for data augmentation, and the training parameters batchsize and learning rate. The optimal value of batchsize is obtained through repeated training, and the learning rate is adjusted adaptively during training according to the following formula:
where D is a constant, taken as 10^-5.
4) detecting persons and vehicles with the trained model;
The trained model is applied to the image to be inspected and outputs the position (x, y, w, h) and confidence of each person or vehicle in it. Targets of suitable size whose confidence reaches the threshold are retained as output.
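The final screening step can be sketched as a simple filter over the detector output. The function name, the tuple layout (label, confidence, (x, y, w, h)), and all threshold values below are hypothetical; the text gives neither the confidence threshold nor what counts as a suitable size.

```python
def filter_detections(detections, conf_threshold=0.5,
                      min_side=16, max_side=1080):
    """Keep detections whose box size is within the assumed limits and
    whose confidence reaches the assumed threshold."""
    kept = []
    for label, conf, (x, y, w, h) in detections:
        size_ok = (min_side <= w <= max_side) and (min_side <= h <= max_side)
        if size_ok and conf >= conf_threshold:
            kept.append((label, conf, (x, y, w, h)))
    return kept


detections = [
    ("person", 0.92, (100, 50, 40, 120)),   # kept
    ("vehicle", 0.30, (300, 200, 180, 90)), # dropped: low confidence
    ("person", 0.88, (10, 10, 4, 9)),       # dropped: too small
]
print(filter_detections(detections))
```

Only detections that survive this filter would be reported as the target that triggered the alarm; everything else is treated as a non-person, non-vehicle disturbance.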
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.