CN109583456B - Infrared surface target detection method based on feature fusion and dense connection - Google Patents

Infrared surface target detection method based on feature fusion and dense connection

Info

Publication number
CN109583456B
CN109583456B (Application CN201811386234.9A)
Authority
CN
China
Prior art keywords
target
feature
bounding box
network
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811386234.9A
Other languages
Chinese (zh)
Other versions
CN109583456A (en)
Inventor
周慧鑫
施元斌
赵东
郭立新
张嘉嘉
秦翰林
王炳健
赖睿
李欢
宋江鲁奇
姚博
于跃
贾秀萍
周峻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN201811386234.9A
Publication of CN109583456A
Application granted
Publication of CN109583456B
Legal status: Active
Anticipated expiration

Abstract

Translated from Chinese

The invention discloses an infrared surface target detection method based on feature fusion and dense connection. An infrared image data set containing the targets to be recognized is constructed, and the positions and categories of those targets are annotated in the data set to obtain the known label images. The data set is divided into a training set and a validation set. Images in the training set are preprocessed with image enhancement and then subjected to feature extraction and feature fusion, and a regression network produces classification results and bounding boxes. A loss function is computed between the classification results and bounding boxes and the known label images, and the parameter values of the convolutional neural network are updated. The parameters are updated iteratively until the error is sufficiently small or the number of iterations reaches a set upper limit. Finally, the trained convolutional neural network processes the images in the validation set to obtain the detection accuracy, the required time, and the final detection result images.


Description

Translated from Chinese

Infrared surface target detection method based on feature fusion and dense connection

Technical Field

The invention belongs to the technical field of image processing, and in particular relates to an infrared surface target detection method based on feature fusion and dense connection.

Background Art

At present, mainstream target detection methods fall roughly into two categories: methods based on background modeling and methods based on foreground modeling. Background-modeling methods build a model of the background and declare regions that differ strongly from it to be targets; because real backgrounds are complex, the detection performance of such methods is often unsatisfactory. Foreground-modeling methods extract feature information of the target and declare the regions that best match this information to be targets; the most representative of these are detection methods based on deep learning. A deep-learning detector automatically extracts target features through a deep convolutional neural network and predicts the category and location of each target. The predictions are then compared with the annotations in the training set, a loss function is computed, and gradient descent is used to improve the features extracted by the network so that they better match the actual targets. At the same time, the parameters of the subsequent detection stage are updated to make the detection results more accurate. Training is repeated until the expected detection performance is reached.

Summary of the Invention

In order to solve the above problems in the prior art, the present invention provides a target detection method based on feature fusion and dense blocks.

The technical scheme adopted by the present invention is as follows:

An embodiment of the present invention provides an infrared surface target detection method based on feature fusion and dense connection, which is implemented through the following steps:

Step 1: construct an infrared image data set containing the targets to be recognized, annotate the positions and categories of those targets in the data set, and obtain the known label images;

Step 2: divide the infrared image data set into a training set and a validation set;

Step 3: preprocess the images in the training set with image enhancement;

Step 4: perform feature extraction and feature fusion on the preprocessed images, and obtain classification results and bounding boxes through a regression network; compute a loss function between the classification results and bounding boxes and the known label images, backpropagate the prediction error through the convolutional neural network using stochastic gradient descent with momentum, and update the parameter values of the convolutional neural network;

Step 5: repeat steps 3 and 4 to iteratively update the parameters of the convolutional neural network until the error is sufficiently small or the number of iterations reaches a set upper limit;

Step 6: process the images in the validation set with the trained convolutional neural network parameters to obtain the detection accuracy, the required time, and the final detection result images.
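Read end to end, steps 3 to 6 amount to a standard supervised training loop. A minimal sketch is given below, assuming a PyTorch implementation; the model, detection loss, data loader, learning rate, momentum, error threshold, and iteration cap are illustrative placeholders rather than values taken from the patent.

```python
# Minimal sketch of the training flow in steps 3-5 (PyTorch assumed).
import torch

def train(model, detection_loss, train_loader, max_iters=50000, err_threshold=1e-3):
    # stochastic gradient descent with momentum, as required by step 4
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    it = 0
    while it < max_iters:
        for images, labels in train_loader:       # step 3: augmented training images
            preds = model(images)                  # step 4: extraction, fusion, regression
            loss = detection_loss(preds, labels)   # compare with the known label images
            optimizer.zero_grad()
            loss.backward()                        # backpropagate the prediction error
            optimizer.step()                       # update the network parameters
            it += 1
            if loss.item() < err_threshold or it >= max_iters:  # step 5: stopping rule
                return model
    return model
```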

In the above scheme, the feature extraction and feature fusion of the preprocessed images in step 4, and the acquisition of the classification results and bounding boxes through the regression network, are specifically implemented through the following steps:

Step 401: randomly select a fixed number of images from the training set and divide each image into a 10×10 grid of regions;

Step 402: input the divided images from step 401 into a densely connected network for feature extraction;

Step 403: perform feature fusion on the extracted feature maps to obtain fused feature maps;

Step 404: generate a fixed number of proposal boxes for each region of the fused feature maps;

Step 405: send the fused feature maps and proposal boxes to the regression network for classification and bounding-box regression, and remove redundancy with non-maximum suppression to obtain the classification results and bounding boxes.

In the above scheme, the densely connected network in step 402 is computed according to the formula

$$d_l = H_l([d_0, d_1, \ldots, d_{l-1}])$$

where $d_l$ denotes the output of the $l$-th convolutional layer of the densely connected network; if the densely connected network contains $B$ convolutional layers in total, then $l$ takes values between 0 and $B$; $H_l(\cdot)$ is the combined operation of regularization, convolution, and rectified-linear activation; $d_0$ is the input image; and $d_{l-1}$ is the output of layer $l-1$.
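A minimal PyTorch sketch of this dense-connection rule follows: each layer receives the concatenation of the block input and all earlier layer outputs, and $H_l$ is written as batch normalization, convolution, and ReLU. The class names, channel counts, and growth rate are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """One H_l: regularization (batch norm) + convolution + linear rectification."""
    def __init__(self, in_channels, growth):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.conv(self.bn(x)))

class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth, num_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            [DenseLayer(in_channels + i * growth, growth) for i in range(num_layers)]
        )

    def forward(self, d0):
        feats = [d0]                              # d_0 is the block input
        for layer in self.layers:
            dl = layer(torch.cat(feats, dim=1))   # d_l = H_l([d_0, ..., d_{l-1}])
            feats.append(dl)
        return torch.cat(feats, dim=1)

block = DenseBlock(in_channels=32, growth=16, num_layers=4)
out = block(torch.randn(1, 32, 80, 80))           # -> (1, 32 + 4*16, 80, 80)
```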

In the above scheme, the feature fusion of the extracted feature maps in step 403 directly fuses feature maps of different scales by means of a pooling operation.

In the above scheme, the feature fusion of the extracted feature maps in step 403 is specifically implemented through the following steps:

Step 4031: convert the first group of feature maps F1 into new, smaller feature maps through a pooling operation, then fuse them with the second group of feature maps F2 to obtain new feature maps F2';

Step 4032: apply a pooling operation to the new feature maps F2', then fuse them with the third group of feature maps F3 to obtain new feature maps F3';

Step 4033: replace the second group of feature maps F2 and the third group of feature maps F3 with the new feature maps F2' and F3' as the input to the regression network.
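A short sketch of this cascade under assumed shapes: F1 is pooled down to the resolution of F2 and fused to give F2', which is pooled again and fused with F3 to give F3'. Channel-wise concatenation is assumed as the fusion operation, since the text only states that the pooled maps are fused directly; the tensor sizes are illustrative.

```python
import torch
import torch.nn.functional as F

def pool_and_fuse(larger, smaller):
    """Pool the larger-scale map by 2x2 and concatenate it with the smaller-scale map."""
    pooled = F.max_pool2d(larger, kernel_size=2, stride=2)
    return torch.cat([pooled, smaller], dim=1)

F1 = torch.randn(1, 64, 80, 80)
F2 = torch.randn(1, 128, 40, 40)
F3 = torch.randn(1, 256, 20, 20)

F2p = pool_and_fuse(F1, F2)    # F2': 64+128 channels at 40x40
F3p = pool_and_fuse(F2p, F3)   # F3': 64+128+256 channels at 20x20
```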

In the above scheme, step 405, in which the fused feature maps and proposal boxes are sent to the regression network for classification and bounding-box regression and redundancy is removed by non-maximum suppression to obtain the classification results and bounding boxes, is specifically implemented through the following steps:

Step 4051: divide the feature map into 10×10 regions and input them into the regression detection network;

Step 4052: for each region, the regression detection network outputs the positions and categories of 7 possible targets; there are A target categories in total, i.e. the network outputs the probability of each of the A categories, where A depends on the configuration of the training set; the position parameters comprise three items of data: the center coordinates, the width, and the height of the target bounding box;

Step 4053: in the non-maximum suppression, for bounding boxes of the same category the intersection-over-union is computed with the following formula:

$$S = \frac{M \cap N}{M \cup N}$$

where S is the computed intersection-over-union, M and N denote two bounding boxes of the same target category, M∩N denotes the intersection of bounding boxes M and N, and M∪N denotes their union. For two bounding boxes with S greater than 0.75, the one with the smaller classification score is discarded.
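A minimal sketch of this suppression rule in plain Python, assuming boxes given as (x1, y1, x2, y2) corner coordinates; the function and variable names are illustrative.

```python
def iou(m, n):
    """Intersection-over-union S of two boxes m and n given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(m[0], n[0]), max(m[1], n[1])
    ix2, iy2 = min(m[2], n[2]), min(m[3], n[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_m = (m[2] - m[0]) * (m[3] - m[1])
    area_n = (n[2] - n[0]) * (n[3] - n[1])
    return inter / (area_m + area_n - inter)

def nms(boxes, scores, threshold=0.75):
    """Greedy suppression: of two same-class boxes with S > threshold, keep the higher score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep
```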

In the above scheme, the computation in step 4 of the loss function between the classification results and bounding boxes and the known label images, the backpropagation of the prediction error through the convolutional neural network using stochastic gradient descent with momentum, and the update of the parameter values of the convolutional neural network are specifically implemented through the following steps:

Step 401: compute the loss function from the classification results, the positions and categories of the targets in the bounding boxes, and the positions and categories of the targets annotated in the training set. The loss function is computed as follows:

$$\mathrm{loss} = \sum_{i=1}^{100}\sum_{j=1}^{7} \mathbb{1}_{ij}^{\mathrm{obj}}\Big[(x_{ij}-\hat{x}_{ij})^{2}+(y_{ij}-\hat{y}_{ij})^{2}+(w_{ij}-\hat{w}_{ij})^{2}+(h_{ij}-\hat{h}_{ij})^{2}+(C_{ij}-\hat{C}_{ij})^{2}\Big]+\sum_{i=1}^{100}\sum_{j=1}^{7} \mathbb{1}_{ij}^{\mathrm{noobj}}(C_{ij}-\hat{C}_{ij})^{2}$$

where 100 is the number of regions; 7 is the number of proposal boxes to be predicted per region and of finally generated bounding boxes; i is the region index and j the box index; loss is the error value; obj indicates that a target is present and noobj that no target is present; x and y are the predicted horizontal and vertical coordinates of the box center; w and h are the predicted width and height of the box; C is the prediction of whether the box contains a target and comprises A values corresponding to the probabilities of the A target categories; $\hat{x}$, $\hat{y}$, $\hat{w}$, $\hat{h}$, and $\hat{C}$ are the corresponding annotated values; and $\mathbb{1}_{ij}^{\mathrm{obj}}$ and $\mathbb{1}_{ij}^{\mathrm{noobj}}$ indicate, respectively, that a target does or does not fall into the j-th box of region i;

Step 402: update the weights with stochastic gradient descent with momentum according to the computed loss.
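The momentum update of step 402 keeps a running velocity per weight and moves each weight along that velocity rather than along the raw gradient. A small sketch follows; the learning rate and momentum coefficient are illustrative assumptions (in PyTorch the same rule is available as torch.optim.SGD(..., momentum=...)).

```python
import torch

def sgd_momentum_step(weights, grads, velocities, lr=1e-3, momentum=0.9):
    """One momentum SGD update applied in place to every weight tensor."""
    with torch.no_grad():
        for w, g, v in zip(weights, grads, velocities):
            v.mul_(momentum).add_(g, alpha=-lr)   # v <- momentum * v - lr * grad
            w.add_(v)                             # w <- w + v

# one toy parameter, its gradient, and its (initially zero) velocity
w = torch.tensor([1.0, 2.0])
g = torch.tensor([0.5, -0.5])
v = torch.zeros_like(w)
sgd_momentum_step([w], [g], [v])
```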

In the above scheme, the preprocessing of step 3 expands the training set through random rotation, mirroring, flipping, scaling, translation, scale transformation, contrast transformation, noise perturbation, and color changes.

Compared with the prior art, the present invention trains on infrared images so that the target detection network acquires the ability to recognize both visible-light and infrared targets; at the same time, the improved network structure gives this method better detection performance than conventional deep-learning methods.

Description of the Drawings

Fig. 1 is a flowchart of the present invention;

Fig. 2 is a network structure diagram of the present invention;

Fig. 3 shows the result images of the present invention.

Detailed Description

In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

An embodiment of the present invention provides an infrared surface target detection method based on feature fusion and dense connection. As shown in Fig. 1, the method is implemented through the following steps.

Step 1. Build the data set

If the detection algorithm is required to recognize infrared images, infrared images must be included in the data set. The present invention builds the data set from infrared images and manually annotates the images in the data set with bounding boxes.

Step 2. Expand the training set

The training set is expanded through random rotation, mirroring, flipping, scaling, translation, scale transformation, contrast transformation, noise perturbation, and color changes. This compensates for the difficulty of collecting data and improves training on small data sets.
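A minimal sketch of such an augmentation pipeline, assuming torchvision; the specific parameter ranges are illustrative assumptions.

```python
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),                                  # random rotation
    transforms.RandomHorizontalFlip(p=0.5),                                 # mirroring
    transforms.RandomVerticalFlip(p=0.5),                                   # flipping
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.8, 1.2)),  # translation, scaling
    transforms.ColorJitter(brightness=0.2, contrast=0.2),                   # contrast / color change
    transforms.ToTensor(),
    transforms.Lambda(lambda x: (x + 0.01 * torch.randn_like(x)).clamp(0.0, 1.0)),  # noise perturbation
])
```

In a detection setting the bounding-box annotations must be transformed with the same geometric operations; the pipeline above only transforms the image itself.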

Step 3. Divide the image into a 10×10 grid

The original image is divided into a 10×10 grid of regions; each region is responsible only for detecting targets whose center falls inside it, which greatly speeds up detection.
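A small sketch of this assignment rule; the 480×640 image size follows the simulation conditions given later, and the function name is illustrative.

```python
def grid_cell(cx, cy, img_w=640, img_h=480, grid=10):
    """Return the (row, col) of the 10x10 cell that owns a target centered at (cx, cy)."""
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col

# A target centered at pixel (500, 300) is assigned to cell (6, 7):
print(grid_cell(500, 300))
```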

Step 4. Extract features with the dense network

The feature extraction process comprises the following steps:

First, the input image is processed by a convolutional layer with 32 kernels of size 3×3, followed by a 2×2 pooling operation, yielding feature map F1.

Second, features are extracted from F1 by a dense block containing 64 3×3 kernels and 64 1×1 kernels while the residual is computed, followed by a 2×2 pooling operation, yielding feature map F2.

Third, features are extracted from F2 by a dense block containing 64 1×1 kernels and 64 3×3 kernels while the residual is computed, followed by a 2×2 pooling operation, yielding feature map F3.

Fourth, features are extracted from F3 by a dense block containing 64 1×1 kernels and 64 3×3 kernels, followed by a 1×1 convolution while the residual is computed, and finally a 2×2 pooling operation, yielding feature map F4.

Fifth, features are extracted from F4 by a dense block containing 256 1×1 kernels and 256 3×3 kernels, followed by a 1×1 convolution while the residual is computed, and finally a 2×2 pooling operation, yielding feature map F5.

Sixth, features are extracted from F5 by a dense block containing 1024 1×1 kernels, 1024 3×3 kernels, and 1024 1×1 kernels, followed by a 1×1 convolution while the residual is computed, yielding feature map F6.
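A compact sketch of this F1 to F6 pipeline follows. For brevity the dense connections and residual additions are omitted and each block is written as a plain stack of the listed convolutions; the single-channel infrared input, the batch normalization, and the padding choices are assumptions rather than details stated in the patent.

```python
import torch.nn as nn

def conv(c_in, c_out, k):
    """Convolution + batch norm + ReLU, padded so the spatial size is preserved."""
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, padding=k // 2),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

backbone = nn.Sequential(
    conv(1, 32, 3), nn.MaxPool2d(2),                                          # -> F1
    conv(32, 64, 3), conv(64, 64, 1), nn.MaxPool2d(2),                        # -> F2
    conv(64, 64, 1), conv(64, 64, 3), nn.MaxPool2d(2),                        # -> F3
    conv(64, 64, 1), conv(64, 64, 3), conv(64, 64, 1), nn.MaxPool2d(2),       # -> F4
    conv(64, 256, 1), conv(256, 256, 3), conv(256, 256, 1), nn.MaxPool2d(2),  # -> F5
    conv(256, 1024, 1), conv(1024, 1024, 3), conv(1024, 1024, 1),             # -> F6
)
```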

Step 5. Fuse the extracted features

The feature fusion method comprises the following steps:

First, take the feature maps F4, F5, and F6 obtained in step 4.

Second, apply 2×2 pooling to feature map F4 four times, taking respectively the upper-left, upper-right, lower-left, and lower-right point of each 2×2 neighbourhood, to form the new feature map F4'; combine F4' with feature map F5 into the feature map group F7.

Third, apply 2×2 pooling to feature map F7 four times, taking respectively the upper-left, upper-right, lower-left, and lower-right point of each 2×2 neighbourhood, to form the new feature map F7'; combine F7' with feature map F6 into the feature map group F8.
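A sketch of this "take the four corners of every 2×2 neighbourhood" operation: each window contributes its upper-left, upper-right, lower-left, and lower-right pixel to four half-resolution maps, which are stacked along the channel axis before being combined with the deeper feature map. The tensor shapes and the use of concatenation for the final combination are illustrative assumptions.

```python
import torch

def corner_pool(x):
    """x: (N, C, H, W) with even H, W  ->  (N, 4*C, H/2, W/2)."""
    ul = x[:, :, 0::2, 0::2]   # upper-left of each 2x2 window
    ur = x[:, :, 0::2, 1::2]   # upper-right
    ll = x[:, :, 1::2, 0::2]   # lower-left
    lr = x[:, :, 1::2, 1::2]   # lower-right
    return torch.cat([ul, ur, ll, lr], dim=1)

F4 = torch.randn(1, 64, 30, 40)
F5 = torch.randn(1, 256, 15, 20)
F4_prime = corner_pool(F4)                 # (1, 256, 15, 20)
F7 = torch.cat([F4_prime, F5], dim=1)      # combine F4' with F5 into the group F7
```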

Step 6. Obtain classification results and bounding boxes by regression detection

The classification results and bounding boxes are obtained as follows: for each region, the classification and regression detection network outputs the positions and categories of 7 possible targets. There are A target categories in total, i.e. the network outputs the probability of each of the A categories, where A depends on the configuration of the training set; the position parameters comprise three items of data: the center coordinates, the width, and the height of the target bounding box.

Step 7. Compute the loss function and update the parameters

The loss function is computed from the positions and categories of the targets output in step 6 and the positions and categories of the targets annotated in the training set; this step is performed only during training. The loss function is computed as follows:

$$\mathrm{loss} = \sum_{i=1}^{100}\sum_{j=1}^{7} \mathbb{1}_{ij}^{\mathrm{obj}}\Big[(x_{ij}-\hat{x}_{ij})^{2}+(y_{ij}-\hat{y}_{ij})^{2}+(w_{ij}-\hat{w}_{ij})^{2}+(h_{ij}-\hat{h}_{ij})^{2}+(C_{ij}-\hat{C}_{ij})^{2}\Big]+\sum_{i=1}^{100}\sum_{j=1}^{7} \mathbb{1}_{ij}^{\mathrm{noobj}}(C_{ij}-\hat{C}_{ij})^{2}$$

where 100 is the number of regions; 7 is the number of proposal boxes to be predicted per region and of finally generated bounding boxes; i is the region index and j the box index; loss is the error value; obj indicates that a target is present and noobj that no target is present; x and y are the predicted horizontal and vertical coordinates of the box center; w and h are the predicted width and height of the box; C is the prediction of whether the box contains a target and comprises A values corresponding to the probabilities of the A target categories; $\hat{x}$, $\hat{y}$, $\hat{w}$, $\hat{h}$, and $\hat{C}$ are the corresponding annotated values; and $\mathbb{1}_{ij}^{\mathrm{obj}}$ and $\mathbb{1}_{ij}^{\mathrm{noobj}}$ indicate, respectively, that a target does or does not fall into the j-th box of region i. Then, according to the computed loss, the weights are updated with stochastic gradient descent with momentum.

Steps 3 to 7 are repeated until the error meets the requirement or the number of iterations reaches the set upper limit.

Step 8. Test on the validation set

The images in the validation set are processed with the target detection network trained in step 7 to obtain the detection accuracy, the required time, and the final detection result images.

The network structure of the present invention is further described below with reference to Fig. 2.

1. Network layer settings

The neural network used in the present invention consists of two parts. The first part is the feature extraction network, composed of 5 dense blocks and containing 25 convolutional layers in total. The second part is the feature fusion and regression detection network, containing 8 convolutional layers and a 1-layer fully convolutional network.

2. Dense block settings

The dense blocks used in the feature extraction network are configured as follows:

(1) Dense block 1 contains 2 convolutional layers; the first layer uses 64 kernels of size 1×1 with stride 1, and the second layer uses 64 kernels of size 3×3 with stride 1. Dense block 1 is used once.

(2) Dense block 2 contains 2 convolutional layers; the first layer uses 64 kernels of size 3×3 with stride 1, and the second layer uses 64 kernels of size 1×1 with stride 1. Dense block 2 is used once.

(3) Dense block 3 contains 2 convolutional layers; the first layer uses 64 kernels of size 1×1 with stride 1, and the second layer uses 64 kernels of size 3×3 with stride 1. Dense block 3 is used twice.

(4) Dense block 4 contains 2 convolutional layers; the first layer uses 256 kernels of size 1×1 with stride 1, and the second layer uses 256 kernels of size 3×3 with stride 1. Dense block 4 is used 4 times.

(5) Dense block 5 contains 3 convolutional layers; the first layer uses 1024 kernels of size 1×1 with stride 1, the second layer uses 1024 kernels of size 3×3 with stride 1, and the third layer uses 1024 kernels of size 1×1 with stride 1. Dense block 5 is used twice.

3. Feature fusion settings

The three groups of feature maps used for fusion come from the outputs of the 9th, 18th, and 25th layers of the feature extraction network. The generated feature maps are then combined with shallower feature maps through convolution and upsampling. The results are further processed by 3×3 and 1×1 convolutional layers, and the resulting three groups of new feature maps are fused.
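A small sketch of this fusion head: the deeper map is upsampled to the shallower map's resolution, combined with it, and refined by 3×3 and 1×1 convolutions. The channel counts, spatial sizes, and the use of nearest-neighbour upsampling plus concatenation are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_up(deep, shallow, refine):
    """Upsample the deeper map, combine with the shallower map, then refine."""
    up = F.interpolate(deep, size=shallow.shape[2:], mode="nearest")
    return refine(torch.cat([up, shallow], dim=1))

refine = nn.Sequential(nn.Conv2d(1024 + 256, 256, 3, padding=1), nn.ReLU(inplace=True),
                       nn.Conv2d(256, 256, 1))
deep = torch.randn(1, 1024, 15, 20)      # e.g. layer-25 output
shallow = torch.randn(1, 256, 30, 40)    # e.g. layer-18 output
fused = fuse_up(deep, shallow, refine)   # (1, 256, 30, 40)
```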

The simulation results of the present invention are further described below with reference to Fig. 3.

1. Simulation conditions:

The images to be detected in the simulation are 480×640 in size and contain pedestrians and bicycles.

2. Simulation results and analysis:

Fig. 3 shows the results of the present invention, where Fig. 3(a) is the image to be detected, Fig. 3(b) shows an extracted feature map, and Fig. 3(c) is the detection result.

Feature extraction with the dense network applied to Fig. 3(a) yields a series of feature maps; because the intermediate process produces too many feature maps, only two are shown, namely Fig. 3(b) and Fig. 3(c). Fig. 3(b) is a feature map extracted by a shallower layer of the network: it is larger and contains more detail information but less semantic information. Fig. 3(c) is a feature map extracted by a deeper layer: it is smaller and contains less detail information but more semantic information.

After fusing the feature maps and performing regression detection, the positions of the pedestrians and bicycles are obtained and marked on the original image, giving the final result shown in Fig. 3(c).

The above is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention.

Claims (2)

Translated from Chinese
1. An infrared surface target detection method based on feature fusion and dense connection, characterized in that the method is implemented through the following steps:

Step 1: construct an infrared image data set containing the targets to be recognized, annotate the positions and categories of those targets in the data set, and obtain the known label images;

Step 2: divide the infrared image data set into a training set and a validation set;

Step 3: preprocess the images in the training set with image enhancement;

Step 4: perform feature extraction and feature fusion on the preprocessed images, and obtain classification results and bounding boxes through a regression network; compute a loss function between the classification results and bounding boxes and the known label images, backpropagate the prediction error through the convolutional neural network using stochastic gradient descent with momentum, and update the parameter values of the convolutional neural network;

Step 5: repeat steps 3 and 4 to iteratively update the parameters of the convolutional neural network until the error is sufficiently small or the number of iterations reaches a set upper limit;

Step 6: process the images in the validation set with the trained convolutional neural network parameters to obtain the detection accuracy, the required time, and the final detection result images;

wherein the feature extraction and feature fusion of the preprocessed images in step 4 and the acquisition of the classification results and bounding boxes through the regression network are specifically implemented through the following steps:

Step 401: randomly select a fixed number of images from the training set and divide each image into a 10×10 grid of regions;

Step 402: input the divided images from step 401 into a densely connected network for feature extraction;

Step 403: perform feature fusion on the extracted feature maps to obtain fused feature maps;

Step 404: generate a fixed number of proposal boxes for each region of the fused feature maps;

Step 405: send the fused feature maps and proposal boxes to the regression network for classification and bounding-box regression, and remove redundancy with non-maximum suppression to obtain the classification results and bounding boxes;

the densely connected network in step 402 is computed according to the formula

$$d_l = H_l([d_0, d_1, \ldots, d_{l-1}])$$

where $d_l$ denotes the output of the $l$-th convolutional layer of the densely connected network; if the densely connected network contains $B$ convolutional layers in total, then $l$ takes values between 0 and $B$; $H_l(\cdot)$ is the combined operation of regularization, convolution, and rectified-linear activation; $d_0$ is the input image; and $d_{l-1}$ is the output of layer $l-1$;

the feature fusion of the extracted feature maps in step 403 directly fuses feature maps of different scales by means of a pooling operation;

the feature fusion of the extracted feature maps in step 403 is specifically implemented through the following steps:

Step 4031: convert the first group of feature maps F1 into new, smaller feature maps through a pooling operation, then fuse them with the second group of feature maps F2 to obtain new feature maps F2';

Step 4032: apply a pooling operation to the new feature maps F2', then fuse them with the third group of feature maps F3 to obtain new feature maps F3';

Step 4033: replace the second group of feature maps F2 and the third group of feature maps F3 with the new feature maps F2' and F3' as the input to the regression network;

step 405, in which the fused feature maps and proposal boxes are sent to the regression network for classification and bounding-box regression and redundancy is removed by non-maximum suppression to obtain the classification results and bounding boxes, is specifically implemented through the following steps:

Step 4051: divide the feature map into 10×10 regions and input them into the regression detection network;

Step 4052: for each region, the regression detection network outputs the positions and categories of 7 possible targets; there are A target categories in total, i.e. the network outputs the probability of each of the A categories, where A depends on the configuration of the training set; the position parameters comprise three items of data: the center coordinates, the width, and the height of the target bounding box;

Step 4053: in the non-maximum suppression, for bounding boxes of the same category the intersection-over-union is computed with

$$S = \frac{M \cap N}{M \cup N}$$

where S is the computed intersection-over-union, M and N denote two bounding boxes of the same target category, M∩N denotes their intersection, and M∪N denotes their union; for two bounding boxes with S greater than 0.75, the one with the smaller classification score is discarded;

the computation in step 4 of the loss function between the classification results and bounding boxes and the known label images, the backpropagation of the prediction error through the convolutional neural network using stochastic gradient descent with momentum, and the update of the parameter values of the convolutional neural network are specifically implemented through the following steps:

Step 401: compute the loss function from the classification results, the positions and categories of the targets in the bounding boxes, and the positions and categories of the targets annotated in the training set, the loss function being computed as

$$\mathrm{loss} = \sum_{i=1}^{100}\sum_{j=1}^{7} \mathbb{1}_{ij}^{\mathrm{obj}}\Big[(x_{ij}-\hat{x}_{ij})^{2}+(y_{ij}-\hat{y}_{ij})^{2}+(w_{ij}-\hat{w}_{ij})^{2}+(h_{ij}-\hat{h}_{ij})^{2}+(C_{ij}-\hat{C}_{ij})^{2}\Big]+\sum_{i=1}^{100}\sum_{j=1}^{7} \mathbb{1}_{ij}^{\mathrm{noobj}}(C_{ij}-\hat{C}_{ij})^{2}$$

where 100 is the number of regions; 7 is the number of proposal boxes to be predicted per region and of finally generated bounding boxes; i is the region index and j the box index; loss is the error value; obj indicates that a target is present and noobj that no target is present; x and y are the predicted horizontal and vertical coordinates of the box center; w and h are the predicted width and height of the box; C is the prediction of whether the box contains a target and comprises A values corresponding to the probabilities of the A target categories; $\hat{x}$, $\hat{y}$, $\hat{w}$, $\hat{h}$, and $\hat{C}$ are the corresponding annotated values; and $\mathbb{1}_{ij}^{\mathrm{obj}}$ and $\mathbb{1}_{ij}^{\mathrm{noobj}}$ indicate, respectively, that a target does or does not fall into the j-th box of region i;

Step 402: update the weights with stochastic gradient descent with momentum according to the computed loss.

2. The infrared surface target detection method based on feature fusion and dense connection according to claim 1, characterized in that the preprocessing of step 3 expands the training set through random rotation, mirroring, flipping, scaling, translation, scale transformation, contrast transformation, noise perturbation, and color changes.
CN201811386234.9A (filed 2018-11-20): Infrared surface target detection method based on feature fusion and dense connection; status: Active; granted as CN109583456B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201811386234.9ACN109583456B (en)2018-11-202018-11-20 Infrared surface target detection method based on feature fusion and dense connection

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201811386234.9ACN109583456B (en)2018-11-202018-11-20 Infrared surface target detection method based on feature fusion and dense connection

Publications (2)

Publication Number | Publication Date
CN109583456A (en) | 2019-04-05
CN109583456B (en) | 2023-04-28

Family

ID=65923459

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201811386234.9A (Active, granted as CN109583456B (en)) | Infrared surface target detection method based on feature fusion and dense connection | 2018-11-20 | 2018-11-20

Country Status (1)

Country | Link
CN | CN109583456B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110197152B (en)* | 2019-05-28 | 2022-08-26 | 南京邮电大学 | Road target identification method for automatic driving system
CN110532914A (en)* | 2019-08-20 | 2019-12-03 | 西安电子科技大学 | Building analyte detection method based on fine-feature study
CN111461145B (en)* | 2020-03-31 | 2023-04-18 | 中国科学院计算技术研究所 | Method for detecting target based on convolutional neural network
US11475249B2 (en)* | 2020-04-30 | 2022-10-18 | Electronic Arts Inc. | Extending knowledge data in machine vision
CN112767354B (en)* | 2021-01-19 | 2025-02-28 | 南京汇川图像视觉技术有限公司 | Defect detection method, device, equipment and storage medium based on image segmentation
CN113869165B (en)* | 2021-09-18 | 2024-11-01 | 山东师范大学 | Traffic scene target detection method and system
CN114255377A (en)* | 2021-12-02 | 2022-03-29 | 青岛图灵科技有限公司 | Differential commodity detection and classification method for intelligent container
CN119206521B (en)* | 2024-12-02 | 2025-01-28 | 四川农业大学 | Tea plant diseases and insect pests detection method based on deep learning and computer device


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107609525A (en)* | 2017-09-19 | 2018-01-19 | 吉林大学 | Remote Sensing Target detection method based on Pruning strategy structure convolutional neural networks
CN107818302A (en)* | 2017-10-20 | 2018-03-20 | 中国科学院光电技术研究所 | Non-rigid multi-scale object detection method based on convolutional neural network
CN107808143A (en)* | 2017-11-10 | 2018-03-16 | 西安电子科技大学 | Dynamic gesture identification method based on computer vision
CN108182456A (en)* | 2018-01-23 | 2018-06-19 | 哈工大机器人(合肥)国际创新研究院 | A kind of target detection model and its training method based on deep learning
CN108038519A (en)* | 2018-01-30 | 2018-05-15 | 浙江大学 | A kind of uterine neck image processing method and device based on dense feature pyramid network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Densely Connected Convolutional Networks; Gao Huang et al.; 2017 IEEE Conference on Computer Vision and Pattern Recognition; 2017-11-09; full text *
Infrared dim small target tracking under complex sky background (复杂天空背景下的红外弱小目标跟踪); Zhao Dong (赵东) et al.; High Power Laser and Particle Beams (强激光与粒子束); 2018-06-30; Vol. 30, No. 6; full text *

Also Published As

Publication number | Publication date
CN109583456A (en) | 2019-04-05

Similar Documents

Publication | Publication Date | Title
CN109583456B (en) Infrared surface target detection method based on feature fusion and dense connection
WO2020102988A1 (en)Feature fusion and dense connection based infrared plane target detection method
CN110084292B (en)Target detection method based on DenseNet and multi-scale feature fusion
CN109886066B (en)Rapid target detection method based on multi-scale and multi-layer feature fusion
CN106650789B (en)Image description generation method based on depth LSTM network
CN107145908B (en) A small target detection method based on R-FCN
CN112488025B (en)Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion
CN106228162B (en)A kind of quick object identification method of mobile robot based on deep learning
CN116206185A (en)Lightweight small target detection method based on improved YOLOv7
CN110929607A (en)Remote sensing identification method and system for urban building construction progress
CN108647585A (en)A kind of traffic mark symbol detection method based on multiple dimensioned cycle attention network
CN110659601B (en) Dense vehicle detection method for remote sensing images based on deep fully convolutional network based on central points
CN111160249A (en) Multi-class target detection method in optical remote sensing images based on cross-scale feature fusion
CN111178121B (en)Pest image positioning and identifying method based on spatial feature and depth feature enhancement technology
CN106980858A (en)The language text detection of a kind of language text detection with alignment system and the application system and localization method
CN108830285A (en)A kind of object detection method of the reinforcement study based on Faster-RCNN
CN106778835A (en)The airport target by using remote sensing image recognition methods of fusion scene information and depth characteristic
CN114519819B (en)Remote sensing image target detection method based on global context awareness
CN116797787B (en)Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN106022273A (en)Handwritten form identification system of BP neural network based on dynamic sample selection strategy
CN106228125A (en)Method for detecting lane lines based on integrated study cascade classifier
CN107273870A (en)The pedestrian position detection method of integrating context information under a kind of monitoring scene
CN110909623B (en)Three-dimensional target detection method and three-dimensional target detector
CN116524189A (en)High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN109800764A (en)A kind of airport X-ray contraband image detecting method based on attention mechanism

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
