CN109359573A


Info

Publication number: CN109359573A
Application number: CN201811160160.7A
Authority: CN
Inventors: 高旭麟; 薛超; 付邦鹏; 焦胜峰
Original assignee: Tianjin Tiandi Weiye Investment Management Co ltd
Current assignee: Tianjin Tiandi Weiye Investment Management Co ltd
Priority date: 2018-09-30
Filing date: 2018-09-30
Publication date: 2019-02-19

Abstract

The invention discloses an alert method based on accurate person-vehicle separation. The method first obtains the moving regions within the alert zone using background modeling and determines candidate regions by blob detection. It then crops detection images according to the candidate regions and runs a deep learning detection network on them. Finally, it outputs the category and exact position of each target in the region, thereby distinguishing whether the target that triggered the alert alarm is a person or a vehicle. The method is robust in both target classification and position localization, and can also accurately filter out alert alarms caused by objects that are neither people nor vehicles.

Description

An alert method and device based on accurate person-vehicle separation

Technical field

The invention belongs to the field of video surveillance, and specifically relates to an alert method and device based on accurate person-vehicle separation in video surveillance.

Background art

With the development of science, technology, and society, video surveillance systems are being applied ever more widely, and the alert capabilities of existing cameras increasingly fail to meet practical demands. In particular, current alert scenarios require the camera not only to detect moving targets but also to classify the detected moving objects automatically and accurately. Because the targets that trigger alarms in alert scenarios are mostly people or vehicles, the method must accurately identify and classify the attributes of the alarm target. Existing alert cameras can identify alarm target attributes to some extent using detection and classification methods such as cascade classifiers and SVM, but these methods place strict requirements on the scene and have low recognition accuracy, which makes them difficult to deploy widely.

Summary of the invention

The object of the present invention is to provide an alert method and device based on accurate person-vehicle separation that quickly and accurately locates and identifies the targets that trigger alarms in alert surveillance scenarios, helping monitoring personnel determine whether the target that triggered the alarm is a person or a vehicle.

To achieve the above object, the technical solution of the present invention is realized as follows:

An alert method based on accurate person-vehicle separation, comprising:

1) obtaining the moving regions within the alert zone using background modeling, determining candidate regions by blob detection, and cropping the ROI of the detection image according to the candidate regions;

2) feeding the cropped detection image into a trained deep learning detection network, detecting and identifying the people and vehicles in the image, and outputting their bounding rectangles, categories, and confidence levels.

Further, the specific method of step 1) comprises: obtaining a foreground image using background modeling; merging adjacent foreground points into targets by blob detection and taking the bounding rectangle of each as a motion blob; excluding blobs that differ greatly from people and vehicles according to the size ratio of the motion blob and the ratio of foreground points to total pixels within the blob; and taking the remaining blobs as candidate regions, according to which the detection image is cropped.

Further, the method for the step 2) detection includes:

201) people's vehicle training dataset is constructed；

202) according to people's vehicle feature construction deep learning training network architecture；

203) configuration training parameter starts to train；

204) people's vehicle is detected with training pattern.

Further, the specific method of constructing the person-vehicle training dataset in step 201) comprises: from a large volume of images captured by alert cameras, obtaining images of people and vehicles that meet the alert alarm trigger conditions under different angles and lighting conditions; scaling the original 1920X1080 pixel images to 1080X720 and 540X360 to obtain three groups of multi-scale images; for each image, taking the shorter side as the side length and cropping two square images, one from the left and one from the right, as training images; then marking the positions and categories of people and vehicles in the images with an annotation tool and generating annotation files in xml format; saving the images as the dataset; and taking 80% of the data as the training set and the remaining data as the test set.

Further, the method of constructing the deep learning training network architecture according to person-vehicle features in step 202) comprises: the constructed network is divided into two parts, feature extraction and multi-fusion detection. Starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1: the 64 feature maps obtained from the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, and each group is convolved with its corresponding feature maps; 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps, which serve as input to the detection layer and to the next convolution layer. Subsequent layers are computed in the same way. The multi-fusion detection part collects feature maps of different sizes, computes the loss between the predicted box position (x, y, w, h) and confidence and the ground-truth position to obtain mbox_loss, finally outputs the (x, y, w, h) and confidence that minimize the loss, and determines the category of the target box from the output confidence vector.

Another aspect of the present invention further provides an alert device based on accurate person-vehicle separation, comprising:

a cropping module, configured to obtain the moving regions within the alert zone using background modeling, determine candidate regions by blob detection, and crop the ROI of the detection image according to the candidate regions;

a deep learning module, configured to feed the cropped detection image into a trained deep learning detection network, detect and identify the people and vehicles in the image, and output their bounding rectangles, categories, and confidence levels.

Further, the cropping module is specifically configured to: obtain a foreground image using background modeling; merge adjacent foreground points into targets by blob detection and take the bounding rectangle of each as a motion blob; exclude blobs that differ greatly from people and vehicles according to the size ratio of the motion blob and the ratio of foreground points to total pixels within the blob; and take the remaining blobs as candidate regions, according to which the detection image is cropped.

Further, the deep learning module comprises:

a dataset unit, configured to construct a person-vehicle training dataset;

an architecture unit, configured to construct a deep learning training network architecture according to person-vehicle features;

a training unit, configured to configure the training parameters and start training;

a detection unit, configured to detect people and vehicles using the trained model.

Further, the dataset unit is specifically configured to: from a large volume of images captured by alert cameras, obtain images of people and vehicles that meet the alert alarm trigger conditions under different angles and lighting conditions; scale the original 1920X1080 pixel images to 1080X720 and 540X360 to obtain three groups of multi-scale images; for each image, take the shorter side as the side length and crop two square images, one from the left and one from the right, as training images; then mark the positions and categories of people and vehicles in the images with an annotation tool and generate annotation files in xml format; save the images as the dataset; and take 80% of the data as the training set and the remaining data as the test set.

Further, the architecture unit is specifically configured as follows: the constructed network is divided into two parts, feature extraction and multi-fusion detection. Starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1: the 64 feature maps obtained from the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, and each group is convolved with its corresponding feature maps; 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps, which serve as input to the detection layer and to the next convolution layer. Subsequent layers are computed in the same way. The multi-fusion detection part collects feature maps of different sizes, computes the loss between the predicted box position (x, y, w, h) and confidence and the ground-truth position to obtain mbox_loss, finally outputs the (x, y, w, h) and confidence that minimize the loss, and determines the category of the target box from the output confidence vector.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention is robust in both target classification and position localization, and can also accurately filter out alert alarms caused by objects that are neither people nor vehicles.

Brief description of the drawings

Fig. 1 is the deep learning training network architecture schematic diagram of the embodiment of the present invention.

Detailed description of the embodiments

It should be noted that, provided there is no conflict, the embodiments of the present invention and the features in the embodiments may be combined with one another.

The present invention proposes an alert method based on accurate person-vehicle separation, the technical solution of which comprises the following steps:

1. Obtain the moving regions within the alert zone using background modeling, determine candidate regions by blob detection, and crop the ROI of the detection image according to the candidate regions;

2. Feed the cropped detection image into a trained deep learning detection network, detect and identify the people and vehicles in the image, and output their bounding rectangles, categories, and confidence levels;

In step 1, a foreground image is obtained using background modeling; blob detection merges adjacent foreground points into targets, and the bounding rectangle of each is taken as a motion blob. Let a rectangular motion blob be denoted (x, y, w, h), and let the corresponding region of the foreground image contain n1 foreground points and n2 total pixels. Blobs that satisfy the following condition are selected as candidate regions, and the detection image is cropped according to the candidate regions:

0.15<(w/h)<2 and (n1/n2)>0.3
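The screening condition above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the foreground mask is assumed to be a list of rows of 0/1 values, the motion blobs are assumed to be already available as (x, y, w, h) rectangles, and n2 is assumed to be the pixel count w*h of the rectangle. The function names `is_candidate`, `count_foreground`, and `select_candidates` are hypothetical.

```python
from typing import List, Tuple

Blob = Tuple[int, int, int, int]  # (x, y, w, h) bounding rectangle of a motion blob


def is_candidate(w: int, h: int, n1: int, n2: int) -> bool:
    """Apply the condition 0.15 < w/h < 2 and n1/n2 > 0.3."""
    if h <= 0 or n2 <= 0:
        return False  # degenerate rectangle, cannot be a person or vehicle
    return 0.15 < (w / h) < 2 and (n1 / n2) > 0.3


def count_foreground(mask: List[List[int]], blob: Blob) -> int:
    """Count non-zero (foreground) pixels inside the blob rectangle."""
    x, y, w, h = blob
    return sum(1 for row in mask[y:y + h] for value in row[x:x + w] if value)


def select_candidates(mask: List[List[int]], blobs: List[Blob]) -> List[Blob]:
    """Keep only the blobs that pass the aspect-ratio and fill-ratio test."""
    candidates = []
    for blob in blobs:
        _, _, w, h = blob
        n1 = count_foreground(mask, blob)
        n2 = w * h  # assumption: n2 is the area of the bounding rectangle
        if is_candidate(w, h, n1, n2):
            candidates.append(blob)
    return candidates
```

A tall, densely filled blob passes both tests, while a wide strip (w/h of 2 or more) or a sparsely filled rectangle is discarded before the detection network is ever run.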

In step 2, the cropped detection image is fed into a trained deep learning detection network, which detects and identifies the people and vehicles in the image and outputs their bounding rectangles, categories, and confidence levels. The specific implementation is as follows:

1) Construct a person-vehicle training dataset;

To give the model good generalization ability and adapt it to various scenes, images of people and vehicles that meet the alert alarm trigger conditions under different angles and lighting conditions are obtained from a large volume of images captured by alert cameras. Because most of the captured alert images are (1920X1080) pixels, the original images are scaled to (1080X720) and (540X360) to enhance the multi-scale recognition ability of the network model, giving three groups of multi-scale images. To simplify network training, each image is cropped into two square training images, one from the left and one from the right, with the shorter side as the side length. The positions and categories of people and vehicles are then marked in the images with an annotation tool, generating annotation files in xml format. The images are saved in the JPEGImages folder and the xml annotation files in the Annotations folder to form the dataset; 80% of the data is taken as the training set and the remaining data as the test set;
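The crop geometry and the 80/20 split described above can be sketched as follows. This is an illustrative sketch only: it assumes landscape images (width not smaller than height), boxes expressed as (left, top, right, bottom), and a seeded random shuffle before splitting, which the text does not specify. The names `SCALES`, `square_crops`, and `split_dataset` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)

# The three image sizes named in the text: the original and two scaled copies.
SCALES = [(1920, 1080), (1080, 720), (540, 360)]


def square_crops(width: int, height: int) -> Tuple[Box, Box]:
    """Return the left-aligned and right-aligned square crop boxes.

    The side length is the shorter image side. Landscape input is assumed.
    """
    side = min(width, height)
    left_box = (0, 0, side, side)
    right_box = (width - side, 0, width, side)
    return left_box, right_box


def split_dataset(items: Sequence, train_fraction: float = 0.8,
                  seed: int = 0) -> Tuple[List, List]:
    """Shuffle (an assumption) and split into training and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    for w, h in SCALES:
        print((w, h), square_crops(w, h))
```

For a 1920X1080 image the two crops overlap in the middle, so a target near the center appears in both training squares.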

2) Construct a deep learning training network architecture according to person-vehicle features;

As shown in Fig. 1, the constructed network is divided into two parts, feature extraction and multi-fusion detection. Starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1: the 64 feature maps obtained from the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, and each group is convolved with its corresponding feature maps; 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps, which serve as input to the detection layer and to the next convolution layer. Subsequent layers are computed in the same way. The multi-fusion detection part collects feature maps of different sizes and computes the loss between the predicted box position l (x, y, w, h) (the output of mbox_loc) and confidence (the output of mbox_cof) and the ground truth g (x, y, w, h) to obtain mbox_loss.

In the loss formula, p denotes the category of the target, and a matching indicator states whether the i-th predicted box matches the j-th ground-truth box with respect to category k: it is 1 if they match and 0 otherwise.
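The layer names mbox_loc and mbox_loss match those used in SSD MultiBox detectors. As an assumption that the text itself does not confirm, the loss can therefore be sketched in the standard SSD form, where N is the number of matched boxes, c the class scores, α a weighting factor, and Pos and Neg the sets of matched and unmatched predicted boxes (all symbols other than l, g, p, i, j, and k are taken from that standard form, not from this document):

```latex
L(x, c, l, g) = \frac{1}{N}\Bigl( L_{conf}(x, c) + \alpha\, L_{loc}(x, l, g) \Bigr)

L_{loc}(x, l, g) = \sum_{i \in Pos} \; \sum_{m \in \{x,\, y,\, w,\, h\}}
    x_{ij}^{k} \, \mathrm{smooth}_{L1}\bigl( l_i^{m} - \hat{g}_j^{m} \bigr)

L_{conf}(x, c) = - \sum_{i \in Pos} x_{ij}^{p} \log \hat{c}_i^{p}
                 - \sum_{i \in Neg} \log \hat{c}_i^{0},
\qquad
\hat{c}_i^{p} = \frac{\exp(c_i^{p})}{\sum_{p} \exp(c_i^{p})}
```

Under this reading, the indicator described in the text corresponds to the matching term in both sums, the localization term penalizes the offset between l and the encoded ground truth, and the confidence term is a softmax cross-entropy over the categories plus background.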

The final output is the (x, y, w, h) and confidence that minimize the loss, and the category of the target box is determined from the output confidence vector.

3) Configure the training parameters and start training;
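The effect of splitting the second layer into two groups can be illustrated by counting weights. The sketch below compares an ordinary 3*3 convolution from 64 to 128 channels with the two-group version described above, followed by a 1*1 fusion layer. The 128 output channels of the 1*1 layer are an assumption, since the text does not state that number, bias terms are ignored, and `conv_weights` is a hypothetical helper.

```python
def conv_weights(in_channels: int, out_channels: int, kernel: int,
                 groups: int = 1) -> int:
    """Number of weights in a (grouped) convolution layer, biases excluded."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    per_group = (in_channels // groups) * (out_channels // groups) * kernel * kernel
    return per_group * groups


# Ordinary 3*3 convolution: every filter sees all 64 input feature maps.
full = conv_weights(64, 128, 3)

# Two groups: each group of 64 filters sees only its own 32 feature maps.
grouped = conv_weights(64, 128, 3, groups=2)

# 1*1 convolution that fuses the two groups (128 outputs is an assumption).
fuse = conv_weights(128, 128, 1)

if __name__ == "__main__":
    print("full 3*3:", full)
    print("grouped 3*3:", grouped)
    print("grouped 3*3 + 1*1 fuse:", grouped + fuse)
```

Grouping halves the 3*3 weights, and the 1*1 layer restores the mixing of information between the two groups at a lower cost than the ungrouped convolution.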

Tuning the network input parameters is crucial to improving the accuracy and recall of the model. The main parameters adjusted are the image preprocessing parameters, which adjust the brightness, hue, and saturation of the images for data augmentation, and the training parameters batchsize and learning rate. The optimal batchsize is found through repeated training, and the learning rate is adjusted adaptively during training according to an adjustment formula in which D is a constant with value 10^-5.
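The brightness, hue, and saturation adjustment used for data augmentation can be sketched per pixel with the standard-library `colorsys` module. This is an illustration only: the scaling model (multiplying value and saturation, shifting hue) and the sampling ranges in `random_jitter` are assumptions, and both function names are hypothetical.

```python
import colorsys
import random
from typing import Tuple

RGB = Tuple[int, int, int]


def jitter_pixel(rgb: RGB, brightness: float = 1.0, saturation: float = 1.0,
                 hue_shift: float = 0.0) -> RGB:
    """Scale brightness and saturation and shift hue of one 8-bit RGB pixel."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_shift) % 1.0                  # hue wraps around the color circle
    s = min(1.0, max(0.0, s * saturation))     # clamp to the valid range
    v = min(1.0, max(0.0, v * brightness))
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return (round(r * 255), round(g * 255), round(b * 255))


def random_jitter(rgb: RGB, rng: random.Random) -> RGB:
    """Draw jitter factors from assumed ranges and apply them to a pixel."""
    return jitter_pixel(
        rgb,
        brightness=rng.uniform(0.7, 1.3),   # assumed range
        saturation=rng.uniform(0.7, 1.3),   # assumed range
        hue_shift=rng.uniform(-0.05, 0.05), # assumed range
    )
```

In practice the same factors would be applied to every pixel of a training image so that the annotation boxes remain valid.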

4) Detect people and vehicles with the trained model;

The trained model is run on the detection image and outputs the position (x, y, w, h) and confidence of each person or vehicle in it. Targets of suitable size whose confidence reaches the threshold are retained as output.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
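The final filtering step can be sketched as follows. The text does not give the confidence threshold or define what counts as a suitable size, so the values `conf_threshold=0.5` and `min_side=8.0` below are placeholders, and the `Detection` type and `keep_detections` function are hypothetical.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x: float           # left edge of the bounding rectangle
    y: float           # top edge of the bounding rectangle
    w: float           # width
    h: float           # height
    label: str         # "person" or "vehicle"
    confidence: float  # score in [0, 1]


def keep_detections(detections: List[Detection],
                    conf_threshold: float = 0.5,
                    min_side: float = 8.0) -> List[Detection]:
    """Retain detections that reach the threshold and are not too small.

    Both limits are assumed values chosen for illustration.
    """
    return [
        d for d in detections
        if d.confidence >= conf_threshold and d.w >= min_side and d.h >= min_side
    ]
```

Only the detections that survive this filter would be reported as the target that triggered the alert alarm.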

Claims

1. An alert method based on accurate person-vehicle separation, characterized by comprising:

1) obtaining the moving regions within the alert zone using background modeling, determining candidate regions by blob detection, and cropping the ROI of the detection image according to the candidate regions;

2) feeding the cropped detection image into a trained deep learning detection network, detecting and identifying the people and vehicles in the image, and outputting their bounding rectangles, categories, and confidence levels.

2. The alert method based on accurate person-vehicle separation according to claim 1, characterized in that the specific method of step 1) comprises: obtaining a foreground image using background modeling; merging adjacent foreground points into targets by blob detection and taking the bounding rectangle of each as a motion blob; excluding blobs that differ greatly from people and vehicles according to the size ratio of the motion blob and the ratio of foreground points to total pixels within the blob; and taking the remaining blobs as candidate regions, according to which the detection image is cropped.

3. The alert method based on accurate person-vehicle separation according to claim 1, characterized in that the detection method of step 2) comprises:

201) constructing a person-vehicle training dataset;

203) configuring the training parameters and starting training;

204) detecting people and vehicles with the trained model.

4. The alert method based on accurate person-vehicle separation according to claim 3, characterized in that the specific method of constructing the person-vehicle training dataset in step 201) comprises: from a large volume of images captured by alert cameras, obtaining images of people and vehicles that meet the alert alarm trigger conditions under different angles and lighting conditions; scaling the original 1920X1080 pixel images to 1080X720 and 540X360 to obtain three groups of multi-scale images; for each image, taking the shorter side as the side length and cropping two square images, one from the left and one from the right, as training images; then marking the positions and categories of people and vehicles in the images with an annotation tool and generating annotation files in xml format; saving the images as the dataset; and taking 80% of the data as the training set and the remaining data as the test set.

5. The alert method based on accurate person-vehicle separation according to claim 3, characterized in that the method of constructing the deep learning training network architecture according to person-vehicle features in step 202) comprises: the constructed network is divided into two parts, feature extraction and multi-fusion detection; starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1, in which the 64 feature maps obtained from the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, each group is convolved with its corresponding feature maps, and 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps that serve as input to the detection layer and to the next convolution layer, subsequent layers being computed in the same way; the multi-fusion detection part collects feature maps of different sizes, computes the loss between the predicted box position (x, y, w, h) and confidence and the ground-truth position to obtain mbox_loss, finally outputs the (x, y, w, h) and confidence that minimize the loss, and determines the category of the target box from the output confidence vector.

6. An alert device based on accurate person-vehicle separation, characterized by comprising:

a cropping module, configured to obtain the moving regions within the alert zone using background modeling, determine candidate regions by blob detection, and crop the ROI of the detection image according to the candidate regions;

a deep learning module, configured to feed the cropped detection image into a trained deep learning detection network, detect and identify the people and vehicles in the image, and output their bounding rectangles, categories, and confidence levels.

7. The alert device based on accurate person-vehicle separation according to claim 6, characterized in that the cropping module is specifically configured to: obtain a foreground image using background modeling; merge adjacent foreground points into targets by blob detection and take the bounding rectangle of each as a motion blob; exclude blobs that differ greatly from people and vehicles according to the size ratio of the motion blob and the ratio of foreground points to total pixels within the blob; and take the remaining blobs as candidate regions, according to which the detection image is cropped.

8. The alert device based on accurate person-vehicle separation according to claim 6, characterized in that the deep learning module comprises:

a dataset unit, configured to construct a person-vehicle training dataset;

a training unit, configured to configure the training parameters and start training;

a detection unit, configured to detect people and vehicles using the trained model.

9. The alert device based on accurate person-vehicle separation according to claim 8, characterized in that the dataset unit is specifically configured to: from a large volume of images captured by alert cameras, obtain images of people and vehicles that meet the alert alarm trigger conditions under different angles and lighting conditions; scale the original 1920X1080 pixel images to 1080X720 and 540X360 to obtain three groups of multi-scale images; for each image, take the shorter side as the side length and crop two square images, one from the left and one from the right, as training images; then mark the positions and categories of people and vehicles in the images with an annotation tool and generate annotation files in xml format; save the images as the dataset; and take 80% of the data as the training set and the remaining data as the test set.

10. The alert device based on accurate person-vehicle separation according to claim 8, characterized in that the architecture unit is specifically configured as follows: the constructed network is divided into two parts, feature extraction and multi-fusion detection; starting from the second layer, the feature extraction network uses a convolutional structure that alternates Conv3*3 and conv1*1, in which the 64 feature maps obtained from the first layer are divided into two groups of 32 each, the 128 3*3 filters of the second layer are likewise divided into two groups, each group is convolved with its corresponding feature maps, and 1*1 filters then concatenate the feature maps produced by the two groups of convolutions into new feature maps that serve as input to the detection layer and to the next convolution layer, subsequent layers being computed in the same way; the multi-fusion detection part collects feature maps of different sizes, computes the loss between the predicted box position (x, y, w, h) and confidence and the ground-truth position to obtain mbox_loss, finally outputs the (x, y, w, h) and confidence that minimize the loss, and determines the category of the target box from the output confidence vector.