Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method and a device for detecting third party construction threats.
In order to achieve the above object, the present invention adopts the following technical scheme.
In a first aspect, the present invention provides a third party construction threat detection method, comprising the steps of:
building a training data set based on images acquired at a third party construction site, and training a target detection model;
inputting the video image into the trained target detection model for feature extraction to obtain three image features with different sizes;
performing target recognition based on the three image features with different sizes, and outputting the category of the third party construction threat.
Further, the step of establishing a training data set based on images acquired at a third party construction site comprises the following steps:
performing data acquisition at a third party construction site, grouping the acquired images according to the type of threat object, and selecting a similar number of image samples for each type;
cleaning the selected image samples and removing unclear image samples;
labeling the cleaned image samples with target frames and categories that indicate the position and size of each target;
denoising the labeled image samples;
converting the denoised images into gray images, and dividing the image samples into a training data set and a test data set at a ratio of 4:1.
Still further, the categories of third party construction threats include: safety helmets, reflective clothing, pneumatic picks, soil piles, construction fences, water horses (water-filled barriers), construction warning signs, bulldozers, excavators, road rollers, spades, earthmoving vehicles, roadblocks and engineering protective fences.
Furthermore, a Gaussian filter based on a convolution operation is used to denoise the image samples, and the convolution formula of the Gaussian kernel is:
G(i, j) = Σ_m Σ_n f(i − m, j − n) × (1 / (2πσ²)) × exp(−(m² + n²) / (2σ²))
where f(i, j) is the pixel value at coordinates (i, j) of the input image sample, G(i, j) is the pixel value at coordinates (i, j) after Gaussian smoothing, and σ is the standard deviation of the Gaussian kernel.
Still further, the color RGB images are converted to gray scale using the rgb2gray function in the skimage image processing library.
Further, the target detection model comprises an input end, a backbone network Backbone, a neck network Neck, and an output end.
Further, when training the target detection model, data enhancement processing is performed on the input image samples: every 4 input image samples are randomly scaled, randomly cropped and randomly arranged, and then stitched together.
Further, the method further comprises: when training the target detection model, automatically screening the target detection frame according to the following steps:
obtaining the confidence si that a target exists in each target detection frame, where si is the confidence that a target exists in the i-th target detection frame, i = 1, 2, …, N; N is the number of target detection frames;
calculating the intersection over union (IOU) of each target detection frame and each candidate frame according to the formula:
IOU(Ai, Bj) = (Ai ∩ Bj) / (Ai ∪ Bj)
where Ai is the i-th target detection frame, Bj is the j-th candidate frame, the candidate frames are the labeled target frames, Ai ∩ Bj denotes the area of the intersection of Ai and Bj, Ai ∪ Bj denotes the area of the union of Ai and Bj, j = 1, 2, …, M; M is the number of candidate frames;
calculating the score of each target detection frame according to the formula:
Si = si × max_j IOU(Ai, Bj), j = 1, 2, …, M
if Si > S0, the target detection frame Ai is retained; otherwise, the target detection frame Ai is deleted; S0 is a set threshold.
Still further, the method further comprises:
If Si > S0, the target detection frame Ai is retained, and all candidate frames Bj satisfying si × IOU(Ai, Bj) > S0 are found and denoted Bjk, k = 1, 2, …, K;
calculating the position of the target detection frame Ai according to the formula:
Ai_O = Σ_k [ IOU(Ai, Bjk) × Bjk_O ] / Σ_k IOU(Ai, Bjk), k = 1, 2, …, K
where Ai_O is the center position coordinate of the target detection frame Ai, and Bjk_O is the center position coordinate of the candidate frame Bjk.
In a second aspect, the present invention provides a third party construction threat detection apparatus based on deep learning, including:
the model training module is used for establishing a training data set based on images acquired at a third party construction site and training a target detection model;
the feature extraction module is used for inputting the video image into the trained target detection model for feature extraction to obtain three image features with different sizes;
the target recognition module is used for performing target recognition based on the three image features with different sizes and outputting the category of the third party construction threat.
Compared with the prior art, the invention has the following beneficial effects.
According to the invention, a training data set is established based on images acquired at a third party construction site, and a target detection model is trained; a video image is input into the trained target detection model for feature extraction to obtain three image features with different sizes; target recognition is performed based on the three image features with different sizes, and the category of the third party construction threat is output, thereby realizing automatic detection of third party construction threats. Because the target detection model extracts three image features with different sizes from the input video image and performs target recognition based on all three, third party construction threats can be accurately recognized.
Detailed Description
The present invention will be further described with reference to the drawings and the detailed description below, in order to make the objects, technical solutions and advantages of the present invention more apparent. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a flowchart of a third party construction threat detection method according to an embodiment of the invention, including the steps of:
Step 101, building a training data set based on images acquired at a third party construction site, and training a target detection model;
Step 102, inputting a video image into the trained target detection model for feature extraction to obtain three image features with different sizes;
Step 103, performing target recognition based on the three image features with different sizes, and outputting the category of the third party construction threat.
In this embodiment, step 101 is mainly used for building a training data set and training a target detection model. In this embodiment, the target detection model is used to identify third party construction threats in video images captured on site, so as to judge whether third party construction is being performed. Third party construction refers to activities in which personnel who are not company staff carry out engineering construction within 50 m on either side of the center line of a gas pipeline, such as road construction, bridge construction, factory building construction, blasting within 500 m, and riverbed dredging within 100 m downstream, and includes construction that directly or indirectly causes hazards and risks to the pipeline. In order to enable the target detection model to accurately identify third party construction threats, this embodiment establishes a training data set by collecting real construction scene video images from third party construction sites, trains the target detection model with the training data set, and optimizes the model parameters by establishing a loss function and applying back propagation.
In this embodiment, step 102 is mainly used for image feature extraction. Video images acquired in real time are input into the trained target detection model, and image features are extracted to obtain three image features with different sizes. Because there are many types of third party construction threats and their sizes range widely from large to small, convolution kernels of a single size have difficulty extracting the features of all the objects at the same time; three convolution kernels of different sizes are therefore used, so that image features of objects of different sizes can be extracted effectively.
In this embodiment, step 103 is mainly used for third party construction threat identification. In this embodiment, target recognition is performed based on three image features of different sizes extracted in the previous step, a target detection frame (circumscribed rectangle of the target image) is generated, confidence is calculated, and finally, the category of each object is obtained based on the highest confidence.
As an optional embodiment, the building of a training data set based on images collected at a third party construction site includes:
performing data acquisition at a third party construction site, grouping the acquired images according to the type of threat object, and selecting a similar number of image samples for each type;
cleaning the selected image samples and removing unclear image samples;
labeling the cleaned image samples with target frames and categories that indicate the position and size of each target;
denoising the labeled image samples;
converting the denoised images into gray images, and dividing the image samples into a training data set and a test data set at a ratio of 4:1.
The present embodiment provides a technical solution for creating a training data set. First, third party construction data are collected, with a similar number of images collected for each type, for example about 300 images per type. Second, the data are cleaned, and pictures in which the target is too small or the image is unclear are discarded. Then the images are labeled with a labeling tool, marking the position and type of each object in the picture so that the target frame covers the upper, lower, left and right edges of the object as closely as possible. After labeling, the labeling-frame files, namely the data-set matching files, are generated, yielding a third party construction data sample library. In actual use, images also have to be recognized in night scenes; to match the on-site camera equipment and the actually deployed recognition scene, the images in the library are additionally denoised and converted to gray images. Denoising eliminates the influence of noise generated during on-site image acquisition on the recognition and extraction of third party construction images, and solves problems such as image blurring and excessive noise points caused by noise interference. A gray-image data set is finally obtained and serves as the actual data set. The data set is divided into a training set and a test set, with 80% used for training and 20% for testing.
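By way of illustration only, the following is a minimal Python sketch of this preprocessing and 4:1 split, assuming a recent version of the skimage library; the directory layout, sigma value and helper names are hypothetical and not part of the embodiment:

import os
import random
from skimage import io
from skimage.color import rgb2gray
from skimage.filters import gaussian

def prepare_samples(image_dir, out_dir, sigma=1.0):
    # Denoise each labeled sample with a Gaussian filter and convert it to a gray image.
    os.makedirs(out_dir, exist_ok=True)
    names = sorted(os.listdir(image_dir))
    for name in names:
        img = io.imread(os.path.join(image_dir, name))
        img = gaussian(img, sigma=sigma, channel_axis=-1)  # Gaussian denoising
        img = rgb2gray(img)                                # color RGB -> gray image
        io.imsave(os.path.join(out_dir, name), (img * 255).astype("uint8"))
    return names

def split_4_to_1(names, seed=0):
    # Divide the sample list into a training set and a test set at a 4:1 ratio.
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * 0.8)
    return names[:n_train], names[n_train:]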
As an alternative embodiment, the categories of the third party construction threat include: safety helmets, reflective clothing, pneumatic picks, soil piles, construction fences, water horses (water-filled barriers), construction warning signs, bulldozers, excavators, road rollers, spades, earthmoving vehicles, roadblocks and engineering protective fences.
The present embodiment gives the classes of third party construction threats. This example lists 14 threat categories in total, all of which are either objects required for construction, such as excavators and bulldozers, or objects representative of third party construction, such as water horses.
As an alternative embodiment, a Gaussian filter based on a convolution operation is used to denoise the image samples, and the convolution formula of the Gaussian kernel is:
G(i, j) = Σ_m Σ_n f(i − m, j − n) × (1 / (2πσ²)) × exp(−(m² + n²) / (2σ²))    (1)
where f(i, j) is the pixel value at coordinates (i, j) of the input image sample, G(i, j) is the pixel value at coordinates (i, j) after Gaussian smoothing, and σ is the standard deviation of the Gaussian kernel.
The embodiment provides a technical scheme for image noise reduction. In this embodiment, a Gaussian filter based on a convolution operation is used to denoise the image samples, and the convolution formula of the Gaussian kernel is shown in formula (1). Most image noise is Gaussian noise, so Gaussian filters are widely used for image denoising. Gaussian filtering is a linear smoothing filter suitable for removing Gaussian noise. Its principle can be understood simply as a weighted average over the pixel values of the image: the value of each pixel is replaced by a weighted average of that pixel and the other pixels in its neighborhood.
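For illustration, a small numpy sketch of this weighted-average principle is given below; it assumes a gray-scale image stored as a 2-D numpy array, with an assumed kernel radius and zero padding at the border:

import numpy as np

def gaussian_kernel(sigma, radius=2):
    # Build a normalized 2-D Gaussian kernel with standard deviation sigma.
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_smooth(image, sigma=1.0, radius=2):
    # Each output pixel is a weighted average of its neighbourhood, as in formula (1).
    k = gaussian_kernel(sigma, radius)
    padded = np.pad(image.astype(float), radius, mode="constant")
    out = np.zeros(image.shape, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1] * k)
    return out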
As an alternative embodiment, the rgb2gray function in the skimage image processing library is used to convert the color RGB images to gray scale.
The present embodiment provides a technical solution for converting a color RGB image into a gray image. The embodiment uses the rgb2gray function in skimage to implement the gray-image conversion; in the Python language, the rgb2gray function is imported from the skimage image processing library in the following manner:
from skimage.color import rgb2gray
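A minimal usage example follows; the file name is hypothetical:

from skimage import io
from skimage.color import rgb2gray

rgb_image = io.imread("site_frame_001.jpg")  # H x W x 3 color image
gray_image = rgb2gray(rgb_image)             # H x W gray image with values in [0, 1]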
As an alternative embodiment, the object detection model includes an input end, a backbone network Backbone, a neck network Neck, and an output end.
The embodiment provides the specific network structure of the target detection model. As shown in fig. 2, the target detection model mainly comprises an input end, a backbone network Backbone, a neck network Neck, and an output end. Each part is described separately below.
(A) Input end
The input end performs random scaling, cropping and other processing on the input images, which increases the effective sample size of the data set and improves the training of the model; in addition, the input end adaptively adds the smallest possible black border to each image according to its size, so that all images in the data set have a uniform size.
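A rough Python sketch of this adaptive padding is given below, assuming a square target size of 640 pixels (an illustrative value, not fixed by the embodiment) and the skimage resize function:

import numpy as np
from skimage.transform import resize

def letterbox(image, target=640):
    # Scale the image while keeping its aspect ratio, then add the smallest black border.
    h, w = image.shape[:2]
    scale = target / max(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = resize(image, (new_h, new_w), preserve_range=True)
    canvas = np.zeros((target, target) + image.shape[2:], dtype=resized.dtype)
    top = (target - new_h) // 2
    left = (target - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas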
(B) Backbone network
The Focus structure in the backbone network is responsible for slicing the input pictures, and the CSP structure is responsible for feature map convolution, which improves the learning effect and reduces the computational complexity.
(C) Neck network
The FPN structure in the Neck network up-samples feature information along a top-down path, fuses the sampled high-level feature information with the low-level feature information, and finally computes a predicted feature map. The PAN structure works in the opposite direction to the FPN: it down-samples feature information along a bottom-up path, fuses the sampled low-level feature information with the high-level feature information, and finally computes a predicted feature map.
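As a purely illustrative sketch of this fusion (not the exact structure of the model), the following PyTorch-style code assumes three backbone feature maps c3, c4, c5 with 256 channels each; the channel count and the 1x1/3x3 convolutions are assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFpnPan(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.reduce = nn.Conv2d(channels * 2, channels, kernel_size=1)
        self.down = nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1)
        self.fuse = nn.Conv2d(channels * 2, channels, kernel_size=1)

    def forward(self, c3, c4, c5):
        # top-down path (FPN): up-sample deep features and fuse them with shallower ones
        p4 = self.reduce(torch.cat([c4, F.interpolate(c5, scale_factor=2)], dim=1))
        p3 = self.reduce(torch.cat([c3, F.interpolate(p4, scale_factor=2)], dim=1))
        # bottom-up path (PAN): down-sample and fuse in the opposite direction
        n4 = self.fuse(torch.cat([p4, self.down(p3)], dim=1))
        n5 = self.fuse(torch.cat([c5, self.down(n4)], dim=1))
        # three predicted feature maps of different sizes
        return p3, n4, n5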
(D) Output end
The bounding box loss function at the output end adopts the CIoU_Loss function, and NMS (Non-Maximum Suppression) is responsible for screening prediction frames, removing prediction frames that overlap strongly (high IOU) but have low confidence. The output end is responsible for making predictions on the input image, generating detection frames with higher confidence together with their labels (category and confidence), and outputting the detection result.
As an alternative embodiment, when training the target detection model, data enhancement processing is performed on the input image samples: every 4 input image samples are randomly scaled, randomly cropped and randomly arranged, and then stitched together.
The embodiment provides a technical scheme for data enhancement. Data enhancement is applied mainly in the model training stage to increase the number of training samples and thereby improve the training accuracy of the model. In this embodiment, every 4 image samples are randomly scaled, randomly cropped and randomly arranged, and then stitched together, so that the number of distinct image samples obtained is far greater than 4, as sketched below.
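A rough numpy sketch of this stitching follows; the output size and scale range are assumed values, and the remapping of labeled target frames onto the stitched image is omitted for brevity:

import random
import numpy as np
from skimage.transform import resize

def stitch_four(samples, out_size=640):
    # Randomly scale, crop and arrange 4 gray-scale images, then stitch them into one.
    half = out_size // 2
    random.shuffle(samples)                       # random arrangement
    canvas = np.zeros((out_size, out_size), dtype=float)
    corners = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (top, left) in zip(samples, corners):
        scale = random.uniform(0.5, 1.5)          # random scaling
        h = max(1, int(img.shape[0] * scale))
        w = max(1, int(img.shape[1] * scale))
        img = resize(img, (h, w))
        y = random.randint(0, max(0, h - half))   # random crop position
        x = random.randint(0, max(0, w - half))
        patch = img[y:y + half, x:x + half]
        canvas[top:top + patch.shape[0], left:left + patch.shape[1]] = patch
    return canvas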
As an alternative embodiment, the method further comprises: when training the target detection model, automatically screening the target detection frame according to the following steps:
obtaining the confidence si that a target exists in each target detection frame, where si is the confidence that a target exists in the i-th target detection frame, i = 1, 2, …, N; N is the number of target detection frames;
calculating the intersection over union (IOU) of each target detection frame and each candidate frame according to the formula:
IOU(Ai, Bj) = (Ai ∩ Bj) / (Ai ∪ Bj)    (2)
where Ai is the i-th target detection frame, Bj is the j-th candidate frame, the candidate frames are the labeled target frames, Ai ∩ Bj denotes the area of the intersection of Ai and Bj, Ai ∪ Bj denotes the area of the union of Ai and Bj, j = 1, 2, …, M; M is the number of candidate frames;
calculating the score of each target detection frame according to the formula:
Si = si × max_j IOU(Ai, Bj), j = 1, 2, …, M    (3)
if Si > S0, the target detection frame Ai is retained; otherwise, the target detection frame Ai is deleted; S0 is a set threshold.
The embodiment provides a technical scheme for automatically screening the target detection frames. Target detection frame screening is typically performed during the model training stage. In the training stage, the model receives the input images and the labeled target frames; the labeled target frames are used as the candidate frames in this embodiment. Each target detection frame is scored based on its confidence and its degree of overlap with the candidate frames, and the target detection frames are then screened based on their scores. First, the confidence si that a target exists in each target detection frame is obtained; si is given by the model, and the larger si is, the greater the likelihood that a target exists in the detection frame. Next, the intersection over union of each target detection frame and each candidate frame is calculated according to formula (2); the IOU reflects the degree of overlap between the two frames, and the higher the overlap between a detection frame and a candidate frame, the greater the possibility that a target exists in the detection frame. To improve screening accuracy, this embodiment scores each target detection frame by the product of its confidence and its IOU. Since there are M candidate frames, each target detection frame corresponds to M IOU values, and the product of the maximum of these values and the confidence is taken as the score Si of the target detection frame, as shown in formula (3). Finally, the score Si of each target detection frame is compared with the set threshold S0: if Si > S0, the target detection frame Ai is retained; otherwise, the target detection frame Ai is deleted. A small sketch of this screening rule follows.
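The following plain-Python sketch illustrates the screening rule under the assumption that boxes are (x1, y1, x2, y2) tuples; the threshold value is an assumed example:

def iou(a, b):
    # Intersection over union of two boxes, formula (2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def screen_detections(detections, confidences, candidates, s0=0.5):
    # Keep a detection frame Ai when Si = si * max_j IOU(Ai, Bj) exceeds S0, formula (3).
    kept = []
    for a, s in zip(detections, confidences):
        score = s * max(iou(a, b) for b in candidates)
        if score > s0:
            kept.append(a)
    return kept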
As an alternative embodiment, the method further comprises:
If Si > S0, the target detection frame Ai is retained, and all candidate frames Bj satisfying si × IOU(Ai, Bj) > S0 are found and denoted Bjk, k = 1, 2, …, K;
calculating the position of the target detection frame Ai according to the formula:
Ai_O = Σ_k [ IOU(Ai, Bjk) × Bjk_O ] / Σ_k IOU(Ai, Bjk), k = 1, 2, …, K    (4)
where Ai_O is the center position coordinate of the target detection frame Ai, and Bjk_O is the center position coordinate of the candidate frame Bjk.
The embodiment provides a technical scheme for determining the center position coordinate of the retained target detection frame Ai. Ai is a target detection frame retained after screening; Ai itself has a center position coordinate, but its error is relatively large. For this purpose, this embodiment obtains the center position coordinate of Ai from the candidate frames matched with it. All candidate frames Bj satisfying si × IOU(Ai, Bj) > S0, denoted Bjk, are regarded as the candidate frames matched with Ai, and a weighted average of the center position coordinates of all Bjk is calculated to obtain the center position coordinate of Ai; the calculation is given in formula (4), with the IOU of Ai and Bjk as the weighting coefficient. When the number of Bjk is K = 1, that is, there is only one matched candidate frame, the center position coordinate of that candidate frame is taken as the center position coordinate of Ai. A small sketch of this calculation follows.
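The sketch below illustrates formula (4), reusing the iou helper from the previous sketch; the threshold value is again an assumed example:

def box_center(b):
    # Center coordinate of a box given as (x1, y1, x2, y2).
    return ((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0)

def refined_center(a, confidence, candidates, s0=0.5):
    # IOU-weighted average of the centers of all Bj with si * IOU(Ai, Bj) > S0.
    matched = [b for b in candidates if confidence * iou(a, b) > s0]
    if not matched:
        return box_center(a)                      # no matched candidate frame
    weights = [iou(a, b) for b in matched]
    cx = sum(w * box_center(b)[0] for w, b in zip(weights, matched)) / sum(weights)
    cy = sum(w * box_center(b)[1] for w, b in zip(weights, matched)) / sum(weights)
    return cx, cy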
Fig. 3 is a schematic diagram of a third party construction threat object detection apparatus according to an embodiment of the invention, the apparatus comprising:
The model training module 11 is used for establishing a training data set based on images acquired at a third party construction site and training a target detection model;
The feature extraction module 12 is configured to input a video image into the trained target detection model for feature extraction, so as to obtain three image features with different sizes;
The target recognition module 13 is used for performing target recognition based on three image features with different sizes and outputting the category of the third party construction threat.
The device of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and its implementation principle and technical effects are similar, and are not described here again.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.